server : update readme
ggml-ci
parent 79a8176883, commit d58f8a1b6b
**Response format**

- Note: In streaming mode (`stream`), only `content`, `tokens` and `stop` will be returned until the end of the completion. Responses are sent using the [Server-sent events](https://html.spec.whatwg.org/multipage/server-sent-events.html) standard. The browser's `EventSource` interface cannot be used, since it does not support `POST` requests.
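
For illustration, here is a minimal TypeScript sketch of a streaming client: it uses `fetch` and parses the `data:` lines of the event stream by hand, since `EventSource` cannot issue the required `POST`. The host and port (`localhost:8080`) and the request fields are assumptions for the sketch, not a definitive client:

```ts
// Minimal sketch: stream a completion and accumulate the generated text.
async function streamCompletion(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/completion", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let text = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";                 // keep any incomplete trailing line
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue; // SSE payload lines only
      const chunk = JSON.parse(line.slice("data: ".length));
      text += chunk.content;                    // next token as a string
      if (chunk.stop) return text;              // generation has stopped
    }
  }
  return text;
}
```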

- `completion_probabilities`: An array of token probabilities for each completion. The array's length is `n_predict`. Each item in the array has the following structure:

```json
{
  "content": "<the token generated by the model>",
  "tokens": [ generated token ids ],
  "probs": [
    {
      "prob": float,
      ...
    },
    ...
  ]
}
```

Notice that each `probs` is an array of length `n_probs`.
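
For reference, the item structure above can be transcribed into TypeScript types. This is a sketch covering only the fields visible in the excerpt; the remaining per-candidate fields are elided here, as above:

```ts
// One entry of `completion_probabilities`, per the JSON sketch above.
interface TokenProbability {
  prob: number; // probability assigned to one candidate token
  // ...further per-candidate fields elided in the excerpt above
}

interface CompletionProbability {
  content: string;           // the token generated by the model
  tokens: number[];          // generated token ids
  probs: TokenProbability[]; // always `n_probs` entries
}
```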

- `content`: Completion result as a string (excluding `stopping_word`, if any). In streaming mode, this will contain the next token as a string.
- `tokens`: Same as `content`, but represented as raw token ids.
- `stop`: Boolean for use with `stream` to check whether generation has stopped (note: this is not related to the stopping words array `stop` from the input options)
- `generation_settings`: The provided options above, excluding `prompt` but including `n_ctx` and `model`. These options may differ from the original ones in some ways (e.g. bad values filtered out, strings converted to tokens, etc.).
- `model`: The path to the model loaded with `-m`
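
Putting the fields above together, a minimal non-streaming TypeScript client might look like the sketch below; the host, port and `n_predict` value are assumptions rather than prescribed values:

```ts
// Minimal sketch: one blocking request, then read the documented fields.
async function complete(prompt: string): Promise<void> {
  const res = await fetch("http://localhost:8080/completion", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, n_predict: 64 }),
  });
  const data = await res.json();
  console.log(data.content);             // completion as a string
  console.log(data.tokens);              // the same completion as raw token ids
  console.log(data.stop);                // true once generation has stopped
  console.log(data.generation_settings); // effective options, post filtering
  console.log(data.model);               // path of the model loaded with -m
}
```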