Mirror of https://github.com/ggerganov/llama.cpp.git (synced 2025-01-08 09:41:45 +00:00)
llamafile : extend sgemm.cpp support for Q5_0 models (#10010)
server : refactor slot input data, move tokenizer to HTTP thread (#10023)