llama.cpp/bug.md at 31f2d03f1bc1e88523d70d70a651ea095b8bd85b

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-30 13:24:35 +00:00

Pierrick Hymbert 525213d2f5

* server: tests: init scenarios
 - health and slots endpoints
 - completion endpoint
 - OAI compatible chat completion requests w/ and without streaming
 - completion multi users scenario
 - multi users scenario on OAI compatible endpoint with streaming
 - multi users with total number of tokens to predict exceeds the KV Cache size
 - server wrong usage scenario, like in Infinite loop of "context shift" #3969
 - slots shifting
 - continuous batching
 - embeddings endpoint
 - multi users embedding endpoint: Segmentation fault #5655
 - OpenAI-compatible embeddings API
 - tokenize endpoint
 - CORS and api key scenario

* server: CI GitHub workflow


---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-02-24 12:28:55 +01:00

498 B


| name | about | labels | assignees |
| --- | --- | --- | --- |
| Bug template | Used to report bugs in llama.cpp | bug-unconfirmed | |

Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bug.

If the bug concerns the server, please try to reproduce it first using the server test scenario framework.
