llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-27 20:04:35 +00:00

Georgi Gerganov 0e70ba686e server : add "tokens" output (#10853)	2024-12-18 11:05:29 +02:00
* server : add "tokens" output
* server : update readme
* server : return tokens ids only if requested
* tests : improve "tokens" type check
* server : remove "tokens" from the OAI endpoint
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

test_basic.py	server : add flag to disable the web-ui (#10762) (#10751)	2024-12-10 18:22:34 +01:00
test_chat_completion.py	server : (refactor) no more json in server_task input (#10691)	2024-12-07 20:21:09 +01:00
test_completion.py	server : add "tokens" output (#10853)	2024-12-18 11:05:29 +02:00
test_ctx_shift.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_embedding.py	server : (embeddings) using same format for "input" and "content" (#10872)	2024-12-18 10:55:09 +02:00
test_infill.py	server : fix format_infill (#10724)	2024-12-08 23:04:29 +01:00
test_lora.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_rerank.py	server : fill usage info in embeddings and rerank responses (#10852)	2024-12-17 18:00:24 +02:00
test_security.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_slot_save.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_speculative.py	server : fix speculative decoding with context shift (#10641)	2024-12-04 22:38:20 +02:00
test_tokenize.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00