root synced commits to refs/pull/9659/head at root/llama.cpp from mirror 2024-12-25 02:44:37 +00:00
d4051c81ee profiler: initial support for profiling graph ops
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
30caac3a68 llama : the WPM vocabs use the CLS token as BOS (#10930)
Compare 475 commits »
root synced and deleted reference refs/tags/refs/pull/10535/merge at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
root synced and deleted reference refs/tags/refs/pull/10967/merge at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
root synced commits to graph-profiler at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
d4051c81ee profiler: initial support for profiling graph ops
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
30caac3a68 llama : the WPM vocabs use the CLS token as BOS (#10930)
Compare 475 commits »
root synced commits to master at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
root synced commits to refs/pull/10220/merge at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
30caac3a68 llama : the WPM vocabs use the CLS token as BOS (#10930)
Compare 58 commits »
root synced commits to refs/pull/10448/merge at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 4 commits »
root synced commits to refs/pull/10535/head at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
30caac3a68 llama : the WPM vocabs use the CLS token as BOS (#10930)
60cfa728e2 ggml : use wstring for backend search paths (#10960)
Compare 200 commits »
root synced commits to refs/pull/10573/merge at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
30caac3a68 llama : the WPM vocabs use the CLS token as BOS (#10930)
Compare 7 commits »
root synced commits to refs/pull/10663/merge at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
9ba399dfa7 server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
Compare 3 commits »
root synced commits to refs/pull/10742/merge at root/llama.cpp from mirror 2024-12-25 02:44:36 +00:00
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 3 commits »
root synced commits to refs/pull/10851/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 3 commits »
root synced commits to refs/pull/10873/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 3 commits »
root synced commits to refs/pull/10894/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 3 commits »
root synced commits to refs/pull/10900/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 2 commits »
root synced commits to refs/pull/10902/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 2 commits »
root synced commits to refs/pull/10912/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 3 commits »
root synced commits to refs/pull/10928/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 3 commits »
root synced commits to refs/pull/10940/head at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
b8679c0bb5 change to "response_fields"
4cf1fef320 clarify docs
Compare 2 commits »
root synced commits to refs/pull/10942/merge at root/llama.cpp from mirror 2024-12-24 18:34:37 +00:00
2cd43f4900 ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
09fe2e7613 server: allow filtering llama server response fields (#10940)
Compare 3 commits »