root/llama.cpp
Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2024-12-24 18:34:36 +00:00.
Actions

Workflows: build.yml, close-issue.yml, docker.yml, editorconfig.yml, gguf-publish.yml, labeler.yml, python-check-requirements.yml, python-lint.yml, python-type-check.yml, server.yml
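Most runs below are marked "Scheduled", i.e. triggered by a cron timer rather than a push (the last two rows are push-triggered). As a minimal, hypothetical sketch only, this is the shape such a trigger takes in a workflow file; the actual cron expressions, workflow names, and job steps of the llama.cpp workflows listed above are not shown on this page and may differ:

```yaml
# Hypothetical sketch of a scheduled GitHub/Gitea Actions workflow.
# The cron time is an assumption inferred from the ~04:12 UTC run times below.
name: nightly

on:
  schedule:
    - cron: "12 4 * * *"   # daily at 04:12 UTC
  push:
    branches: [master]      # push-triggered runs appear as "Commit ... pushed by"

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: |
          cmake -B build
          cmake --build build --config Release
```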
| Run | Commit | Trigger | Branch | Finished (UTC) | Duration |
| --- | --- | --- | --- | --- | --- |
| #1568 | ggml : fix const usage in SSE path (#10962) | Scheduled | master | 2024-12-24 04:12:32 | 0s |
| #1557 | llama : support InfiniAI Megrez 3b (#10893) | Scheduled | master | 2024-12-24 04:12:32 | 0s |
| #1550 | convert : add BertForMaskedLM (#10919) | Scheduled | master | 2024-12-23 04:12:32 | 0s |
| #1545 | ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0() (#10874) | Scheduled | master | 2024-12-22 04:12:32 | 0s |
| #1536 | clip : disable GPU support (#10896) | Scheduled | master | 2024-12-21 04:12:32 | 0s |
| #1527 | ggml : fix arm build (#10890) | Scheduled | master | 2024-12-20 04:12:32 | 0s |
| #1512 | Use model->gguf_kv for loading the template instead of using the C API. (#10868) | Scheduled | master | 2024-12-19 04:12:32 | 0s |
| #1499 | rwkv6: add wkv6 support for Vulkan backend (#10829) | Scheduled | master | 2024-12-18 04:12:32 | 0s |
| #1495 | llava : Allow locally downloaded models for QwenVL (#10833) | Scheduled | master | 2024-12-17 04:12:32 | 0s |
| #1482 | nix: allow to override rocm gpu targets (#10794) | Scheduled | master | 2024-12-16 04:12:32 | 0s |
| #1475 | Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693) | Scheduled | master | 2024-12-15 04:12:32 | 0s |
| #1470 | contrib : add ngxson as codeowner (#10804) | Scheduled | master | 2024-12-14 04:12:49 | 0s |
| #1404 | server : (UI) add tok/s, get rid of completion.js (#10786) | Scheduled | master | 2024-12-13 04:12:49 | 0s |
| #1399 | ggml: load all backends from a user-provided search path (#10699) | Scheduled | master | 2024-12-12 04:12:49 | 0s |
| #1393 | CUDA: fix shared memory access condition for mmv (#10740) | Scheduled | master | 2024-12-11 04:12:49 | 0s |
| #1391 | server : fix format_infill (#10724) | Scheduled | master | 2024-12-10 04:12:49 | 0s |
| #1385 | llama : add 128k yarn context for Qwen (#10698) | Scheduled | master | 2024-12-09 04:12:49 | 0s |
| #1371 | convert : add custom attention mapping | Scheduled | master | 2024-12-08 04:12:49 | 0s |
| #1365 | fix(server) : not show alert when DONE is received (#10674) | Scheduled | master | 2024-12-07 04:12:49 | 0s |
| #1356 | Update deprecation-warning.cpp (#10619) | Scheduled | master | 2024-12-06 04:12:49 | 0s |
| #1345 | clip : add sycl support (#10574) | Scheduled | master | 2024-12-05 04:12:49 | 0s |
| #1341 | llama : add enum for built-in chat templates (#10623) | Scheduled | master | 2024-12-04 04:12:49 | 0s |
| #1333 | Add `mistral-v1`, `mistral-v3`, `mistral-v3-tekken` and `mistral-v7` chat template types (#10572) | Scheduled | master | 2024-12-03 04:12:49 | 0s |
| #1331 | ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() (#10567) | Scheduled | master | 2024-12-02 04:12:49 | 0s |
| #1328 | ggml : move AMX to the CPU backend (#10570) | Scheduled | master | 2024-12-01 04:12:49 | 0s |
| #1322 | ggml : remove redundant copyright notice + update authors | Scheduled | master | 2024-11-30 04:12:49 | 0s |
| #1312 | Add some minimal optimizations for CDNA (#10498) | Scheduled | master | 2024-11-29 04:12:49 | 0s |
| #1310 | Add OLMo 2 model in docs (#10530) | Scheduled | master | 2024-11-28 04:12:49 | 0s |
| #1261 | server : add more information about error (#10455) | Commit 9fd8c2687f pushed by root | master | 2024-11-27 02:09:49 | 0s |
| #1247 | [SYCL] Fix building Win package for oneAPI 2025.0 update (#10483) | Commit 5a8987793f pushed by root | master | 2024-11-25 20:29:54 | 0s |