llama.cpp/examples/server/tests/unit
Georgi Gerganov 152610eda9
server : output embeddings for all tokens when pooling = none (#10861)
* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : update readme [no ci]

* server : fix spacing [no ci]

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : be explicit about the pooling type in the tests

ggml-ci

* server : update /embeddings and /v1/embeddings endpoints

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* server : update readme

ggml-ci

* server : fixes

* tests : update server tests

ggml-ci

* server : update readme [no ci]

* server : remove rebase artifact

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-12-18 13:01:41 +02:00
File                     Last commit                                                             Date
test_basic.py            server : add flag to disable the web-ui (#10762) (#10751)               2024-12-10 18:22:34 +01:00
test_chat_completion.py  server : (refactor) no more json in server_task input (#10691)          2024-12-07 20:21:09 +01:00
test_completion.py       server : add "tokens" output (#10853)                                   2024-12-18 11:05:29 +02:00
test_ctx_shift.py        server : replace behave with pytest (#10416)                            2024-11-26 16:20:18 +01:00
test_embedding.py        server : output embeddings for all tokens when pooling = none (#10861)  2024-12-18 13:01:41 +02:00
test_infill.py           server : fix format_infill (#10724)                                     2024-12-08 23:04:29 +01:00
test_lora.py             server : replace behave with pytest (#10416)                            2024-11-26 16:20:18 +01:00
test_rerank.py           server : fill usage info in embeddings and rerank responses (#10852)    2024-12-17 18:00:24 +02:00
test_security.py         server : replace behave with pytest (#10416)                            2024-11-26 16:20:18 +01:00
test_slot_save.py        server : replace behave with pytest (#10416)                            2024-11-26 16:20:18 +01:00
test_speculative.py      server : fix speculative decoding with context shift (#10641)           2024-12-04 22:38:20 +02:00
test_tokenize.py         server : replace behave with pytest (#10416)                            2024-11-26 16:20:18 +01:00
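
The latest commit in this directory (#10861) says the server outputs embeddings for all tokens when pooling = none, and does not normalize them in that case. The snippet below is a minimal sketch of that behavior in plain Python, not the server's actual code. The function names (`l2_normalize`, `mean_pool`, `shape_embedding_output`) are hypothetical, and the choice of mean pooling with L2 normalization for the pooled case is an assumption made only to have something to contrast against.

```python
import math
from typing import List, Union

Vector = List[float]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_pool(token_vecs: List[Vector]) -> Vector:
    """Average the per-token vectors component by component."""
    n = len(token_vecs)
    return [sum(col) / n for col in zip(*token_vecs)]


def shape_embedding_output(
    token_vecs: List[Vector], pooling: str
) -> Union[Vector, List[Vector]]:
    """Hypothetical helper illustrating the two output shapes.

    pooling == "none": one raw vector per token, no normalization.
    pooling == "mean": a single pooled vector, L2-normalized (assumed here).
    """
    if pooling == "none":
        return [list(v) for v in token_vecs]
    if pooling == "mean":
        return l2_normalize(mean_pool(token_vecs))
    raise ValueError(f"unsupported pooling type in this sketch: {pooling}")


token_vecs = [[3.0, 0.0], [0.0, 4.0]]
per_token = shape_embedding_output(token_vecs, "none")
pooled = shape_embedding_output(token_vecs, "mean")
```

With pooling set to "none" the result keeps one row per input token and the original magnitudes, which is why a test for this mode would check the number of rows rather than unit length.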