llama.cpp/examples/server/tests/features
Jan Boon beea6e1b16
llama : save and restore kv cache for single seq id (#6341)
* llama : save and restore kv cache for single seq id

* remove trailing whitespace

* respond error in case there's no space in the kv cache

* add kv seq save restore to test case

* add --slot-save-path arg to enable save restore and restrict save location

* Returning 0 for some cases, instead of asserting.

* cleanup error cases

* rename sequence state functions

* rename state get set functions

* add previous function names back in with DEPRECATED notice

* update doc

* adjust endpoints to preferred style

* fix restoring zero cell count

* handle seq rm return value

* unused param

* keep in the size check

* fix return types

* add server test case for slot save restore

* cleanup

* add cake

* cleanup style

* add special

* removing a whole sequence never fails

* move sequence state file functionality from server to llama to match session api and add version tags

* catch exceptions on save as well

* error log messages

* check types for stricter restore

* update server doc

* readme : update API changes date

* strict filename validation

* move include, reject bom as well

* also reject empty filename

* reject whitespace and trailing dot

---------

Co-authored-by: Martin Evans <martindevans@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-08 15:43:30 +03:00
Name                  Last commit                                                                     Date
steps                 llama : save and restore kv cache for single seq id (#6341)                     2024-04-08 15:43:30 +03:00
embeddings.feature    common: llama_load_model_from_url using --model-url (#6098)                     2024-03-17 19:12:37 +01:00
environment.py        server tests : more pythonic process management; fix bare except: (#6146)       2024-03-20 06:33:49 +01:00
issues.feature        server: tests: passkey challenge / self-extend with context shift demo (#5832)  2024-03-02 22:00:14 +01:00
parallel.feature      common: llama_load_model_from_url split support (#6192)                         2024-03-23 18:07:00 +01:00
passkey.feature       server: tests: passkey challenge / self-extend with context shift demo (#5832)  2024-03-02 22:00:14 +01:00
security.feature      json-schema-to-grammar improvements (+ added to server) (#5978)                 2024-03-21 11:50:43 +00:00
server.feature        common: llama_load_model_from_url split support (#6192)                         2024-03-23 18:07:00 +01:00
slotsave.feature      llama : save and restore kv cache for single seq id (#6341)                     2024-04-08 15:43:30 +03:00
wrong_usages.feature  server: tests: passkey challenge / self-extend with context shift demo (#5832)  2024-03-02 22:00:14 +01:00