root/llama.cpp (mirror of https://github.com/ggerganov/llama.cpp.git, synced 2024-09-22 21:16:20 +00:00)
Actions
All Workflows
build.yml
close-issue.yml
docker.yml
editorconfig.yml
gguf-publish.yml
labeler.yml
nix-ci-aarch64.yml
nix-ci.yml
nix-flake-update.yml
nix-publish-flake.yml
python-check-requirements.yml
python-lint.yml
python-type-check.yml
server.yml
#218 | musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (#9526) | commit c35e586ea5 | pushed by root to master | 2024-09-22 21:16:20 +00:00 | 0s
#212 | CUDA: enable Gemma FA for HIP/Pascal (#9581) | commit a5b57b08ce | pushed by root to master | 2024-09-22 21:16:20 +00:00 | 0s
#203 | llama: remove redundant loop when constructing ubatch (#9574) | commit ecd5d6b65b | pushed by root to master | 2024-09-22 13:06:19 +00:00 | 0s
#197 | ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG (#9573) | commit d09770cae7 | pushed by root to master | 2024-09-22 04:56:20 +00:00 | 0s
#190 | Update CUDA graph on scale change plus clear nodes/params (#9550) | commit 41f477879f | pushed by root to master | 2024-09-21 12:36:19 +00:00 | 0s
#184 | quantize : improve type name parsing (#9570) | commit 63351143b2 | pushed by root to master | 2024-09-21 04:26:20 +00:00 | 0s
#178 | examples : flush log upon ctrl+c (#9559) | commit d39e26741f | pushed by root to master | 2024-09-20 20:16:21 +00:00 | 0s
#169 | server : clean-up completed tasks from waiting list (#9531) | commit 6026da52d6 | pushed by root to master | 2024-09-20 12:06:20 +00:00 | 0s
#159 | ggml : fix n_threads_cur initialization with one thread (#9538) | commit 64c6af3195 | pushed by root to master | 2024-09-19 11:36:20 +00:00 | 0s
#150 | server : match OAI structured output response (#9527) | commit 8a308354f6 | pushed by root to master | 2024-09-18 19:16:20 +00:00 | 0s
#145 | [SYCL] set context default value to avoid memory issue, update guide (#9476) | commit faf67b3de4 | pushed by root to master | 2024-09-18 11:06:21 +00:00 | 0s
#139 | arg : add env variable for parallel (#9513) | commit 8b836ae731 | pushed by root to master | 2024-09-18 02:56:19 +00:00 | 0s
#129 | llama : fix n_vocab init for 'no_vocab' case (#9511) | commit 8344ef58f8 | pushed by root to master | 2024-09-17 18:46:21 +00:00 | 0s
#110 | ggml : move common CPU backend impl to new header (#9509) | commit 23e0d70bac | pushed by root to master | 2024-09-17 10:46:18 +00:00 | 0s
#103 | convert : identify missing model files (#9397) | commit d54c21df7e | pushed by root to master | 2024-09-16 18:26:17 +00:00 | 0s
#89 | common : reimplement logging (#9418) | commit 6262d13e0b | pushed by root to master | 2024-09-16 10:16:19 +00:00 | 0s
#82 | py : add "LLaMAForCausalLM" conversion support (#9485) | commit 3c7989fd29 | pushed by root to master | 2024-09-15 17:56:18 +00:00 | 0s
#71 | ggml : ggml_type_name return "NONE" for invalid values (#9458) | commit 822b6322de | pushed by root to master | 2024-09-15 09:46:17 +00:00 | 0s
#66 | cmake : use list(APPEND ...) instead of set() + dedup linker (#9463) | commit 1f4111e540 | pushed by root to master | 2024-09-14 17:26:17 +00:00 | 0s
#57 | server : add loading html page while model is loading (#9468) | commit feff4aa846 | pushed by root to master | 2024-09-14 09:16:17 +00:00 | 0s
#50 | llama : llama_perf + option to disable timings during decode (#9355) | commit 0abc6a2c25 | pushed by root to master | 2024-09-13 16:56:18 +00:00 | 0s
#42 | server : Add option to return token pieces in /tokenize endpoint (#9108) | commit 78203641fe | pushed by root to master | 2024-09-13 08:46:18 +00:00 | 0s
#31 | cann: Add host buffer type for Ascend NPU (#9406) | commit e6b7801bd1 | pushed by root to master | 2024-09-13 00:36:16 +00:00 | 0s
#24 | cann: Fix error when running a non-exist op (#9424) | commit df4b7945ae | pushed by root to master | 2024-09-12 16:26:18 +00:00 | 0s
#4 | llama : skip token bounds check when evaluating embeddings (#9437) | commit 1b28061400 | pushed by root to master | 2024-09-12 08:16:17 +00:00 | 0s