llama.cpp/common
haopeng 64ed2091b2
server: Add "tokens per second" information in the backend (#10548)
* add cmake rvv support

* add timings

* remove space

* update readme

* fix

* fix code

* remove empty line

* add test

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2024-12-02 14:45:54 +01:00
cmake llama : reorganize source code + improve CMake (#8006) 2024-06-26 18:33:02 +03:00
arg.cpp common: fix warning message when no GPU found (#10564) 2024-11-28 18:15:25 +01:00
arg.h common : use common_ prefix for common library functions (#9805) 2024-10-10 22:57:42 +02:00
base64.hpp llava : expose as a shared library for downstream projects (#3613) 2023-11-07 00:36:23 +03:00
build-info.cpp.in build : link against build info instead of compiling against it (#3879) 2023-11-02 08:50:16 +02:00
CMakeLists.txt ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
common.cpp ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
common.h server: Add "tokens per second" information in the backend (#10548) 2024-12-02 14:45:54 +01:00
console.cpp console : utf-8 fix for windows stdin (#9690) 2024-09-30 11:23:42 +03:00
console.h gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
json-schema-to-grammar.cpp grammar : fix JSON Schema for string regex with top-level alt. (#9903) 2024-10-16 19:03:24 +03:00
json-schema-to-grammar.h JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) 2024-05-08 21:53:08 +02:00
json.hpp json-schema-to-grammar improvements (+ added to server) (#5978) 2024-03-21 11:50:43 +00:00
log.cpp common : use common_ prefix for common library functions (#9805) 2024-10-10 22:57:42 +02:00
log.h common : use common_ prefix for common library functions (#9805) 2024-10-10 22:57:42 +02:00
ngram-cache.cpp common : use common_ prefix for common library functions (#9805) 2024-10-10 22:57:42 +02:00
ngram-cache.h common : use common_ prefix for common library functions (#9805) 2024-10-10 22:57:42 +02:00
sampling.cpp speculative : refactor and add a simpler example (#10362) 2024-11-25 09:58:41 +02:00
sampling.h speculative : refactor and add a simpler example (#10362) 2024-11-25 09:58:41 +02:00
speculative.cpp server : add more information about error (#10455) 2024-11-25 22:28:59 +02:00
speculative.h speculative : refactor and add a simpler example (#10362) 2024-11-25 09:58:41 +02:00
stb_image.h common : Update stb_image.h to latest version (#9161) 2024-08-27 08:58:50 +03:00