From 4fd985af9144676dc711429372845d64a2cd4b61 Mon Sep 17 00:00:00 2001
From: johannes
Date: Mon, 9 Dec 2024 23:55:36 +0100
Subject: [PATCH] server: Update README to include standby-timeout

---
 examples/server/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/server/README.md b/examples/server/README.md
index 117c52d3f..7f32d0d37 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -155,6 +155,7 @@ The project is under active development, and we are [looking for feedback and co
 | `-to, --timeout N` | server read/write timeout in seconds (default: 600)<br/>(env: LLAMA_ARG_TIMEOUT) |
 | `--threads-http N` | number of threads used to process HTTP requests (default: -1)<br/>(env: LLAMA_ARG_THREADS_HTTP) |
 | `--cache-reuse N` | min chunk size to attempt reusing from the cache via KV shifting (default: 0)<br/>(env: LLAMA_ARG_CACHE_REUSE) |
+| `--standby-timeout N` | seconds that must pass since a request has been served, before the server stops automatically (default: 0)<br/>(env: LLAMA_ARG_STANDBY_TIMEOUT) |
 | `--metrics` | enable prometheus compatible metrics endpoint (default: disabled)<br/>(env: LLAMA_ARG_ENDPOINT_METRICS) |
 | `--slots` | enable slots monitoring endpoint (default: disabled)<br/>(env: LLAMA_ARG_ENDPOINT_SLOTS) |
 | `--props` | enable changing global properties via POST /props (default: disabled)<br/>(env: LLAMA_ARG_ENDPOINT_PROPS) |