Commit Graph

9 Commits

Author SHA1 Message Date
Xuan Son Nguyen
caa106d4e0
Server: format error to json (#5961)
* server: format error to json

* server: do not crash on grammar error

* fix api key test case

* revert limit max n_predict

* small fix

* correct coding style

* update completion.js

* launch_slot_with_task

* update docs

* update_slots

* update webui

* update readme
2024-03-11 10:56:41 +01:00
Alexey Parfenov
213d1439fa
server : remove model.json endpoint (#5371) 2024-02-06 20:08:38 +02:00
Justin Parker
f2eb19bd8b
server : throw an error when slot unavailable (#4741) 2024-01-03 10:43:19 +02:00
ShadovvBeast
88ae8952b6
server : add optional API Key Authentication example (#4441)
* Add API key authentication for enhanced server-client security

* server : to snake_case

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-12-15 13:49:01 +02:00
Richard Kiss
9494d7c477
english : use typos to fix comments and logs (#4354) 2023-12-12 11:53:36 +02:00
SoftwareRenderer
936c79b227
server : relay error messages (#4131) 2023-11-19 18:54:10 +02:00
Stephen Nichols
5f631c2679
Fixing race condition in server and partial stream handling in frontend. (#2391)
* Fixing race condition in server.cpp and partial stream handling in completion.js

* Reverting assert edits.

* Adding newline to eof
2023-08-04 13:37:24 +02:00
Tobias Lütke
31cfbb1013
Expose generation timings from server & update completions.js (#2116)
* use javascript generators as much cleaner API

Also add ways to access completion as promise and EventSource

* export llama_timings as struct and expose them in server

* update readme, update baked includes

* llama : uniform variable names + struct init

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-05 16:51:13 -04:00
Tobias Lütke
7ee76e45af
Simple webchat for server (#1998)
* expose simple web interface on root domain

* embed index and add --path for choosing static dir

* allow server to multithread

because web browsers send a lot of garbage requests we want the server
to multithread when serving 404s for favicon's etc. To avoid blowing up
llama we just take a mutex when it's invoked.


* let's try this with the xxd tool instead and see if msvc is happier with that

* enable server in Makefiles

* add /completion.js file to make it easy to use the server from js

* slightly nicer css

* rework state management into session, expose historyTemplate to settings

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-04 16:05:27 +02:00