Commit Graph

146 Commits

Author SHA1 Message Date
M. Yusuf Sarıgöz
2679c432d5 Update readme to document multimodal in server 2023-10-22 19:49:33 +03:00
Georgi Gerganov
f305d6434f editorconfig : new line in index.html 2023-10-22 19:10:30 +03:00
M. Yusuf Sarıgöz
5359fb9267 Do not save/load image_data to localStorage 2023-10-22 19:08:09 +03:00
Georgi Gerganov
f67d971344 server : bug fix for prompt caching 2023-10-22 17:52:59 +03:00
Georgi Gerganov
569ebf11cf server : refactor ctx_sampling init + n_ctx + names 2023-10-22 16:57:05 +03:00
Georgi Gerganov
ef18f4d579 server : fix crash in Debug on macOS (I have no idea why this fixes it!?) 2023-10-22 16:55:40 +03:00
Georgi Gerganov
197a0a9e23 server : fix switch fallthrough 2023-10-22 16:55:05 +03:00
Georgi Gerganov
176993c871 Merge branch 'master' into server-rev 2023-10-22 15:04:16 +03:00
FSSRepo
2eb4c11ec5 fix image load + view image in chat 2023-10-21 14:34:19 -04:00
Jhen-Jie Hong
17b23eb9cb server : fix multibyte handling in partial response (#3706) 2023-10-21 14:58:03 +03:00
Georgi Gerganov
d1031cf49c sampling : refactor init to use llama_sampling_params (#3696)
* sampling : refactor init to use llama_sampling_params

* llama : combine repetition, frequency and presence penalties in 1 call

* examples : remove embd-input and gptneox-wip

* sampling : rename penalty params + reduce size of "prev" vector

* sampling : add llama_sampling_print helper

* sampling : hide prev behind API and apply #3661

ggml-ci
2023-10-20 21:07:23 +03:00
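For reference, a minimal sketch of how the refactored sampling init from the commit above might be used. This is an assumption-laden illustration, not code from the commit: it assumes the llama_sampling_init()/llama_sampling_free() helpers and field names exposed by common/sampling.h after the refactor, and the values are made up.

```cpp
// Sketch only: assumes common/sampling.h exposes llama_sampling_params,
// llama_sampling_init() and llama_sampling_free() after the refactor above.
#include "sampling.h"

int main() {
    llama_sampling_params sparams;      // per-sequence sampling parameters
    sparams.temp           = 0.7f;      // illustrative values, not defaults
    sparams.top_p          = 0.9f;
    sparams.penalty_repeat = 1.1f;      // renamed penalty param per the commit message

    // one sampling context per sequence, created from the params alone
    llama_sampling_context * ctx_sampling = llama_sampling_init(sparams);

    // ... llama_sampling_sample() / llama_sampling_accept() per generated token ...

    llama_sampling_free(ctx_sampling);
    return 0;
}
```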
Georgi Gerganov
778c070d1b server : logs + minor code style 2023-10-20 20:44:51 +03:00
Georgi Gerganov
5d540e80d1 server : no need for atomic int - already using mutex 2023-10-20 20:44:29 +03:00
Georgi Gerganov
113dd60005 server : batch has to be allocated for n_parallel sequences 2023-10-20 20:42:45 +03:00
FSSRepo
6b2437e32d added thread-safe pipeline 2023-10-20 12:07:32 -04:00
Georgi Gerganov
a0edf73bda server : fix uninitialized sampling context (close #3685) 2023-10-20 13:06:10 +03:00
Georgi Gerganov
325d1793f7 server : minor sync 2023-10-19 15:03:24 +03:00
Georgi Gerganov
9740824ba5 server : snake case 2023-10-19 14:44:37 +03:00
Georgi Gerganov
e3a2c3fe32 server : use refs + use llama_batch_clear() 2023-10-19 14:44:04 +03:00
Georgi Gerganov
3d5929e8ee server : bug fix in ingest_images
n_tokens is incremented internally by llama_batch_add
2023-10-19 14:43:19 +03:00
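A minimal sketch of the batching pattern that the ingest_images fix above implies, assuming the llama_batch_clear()/llama_batch_add() helpers from common.h; queue_prompt_tokens is a hypothetical name used only for illustration.

```cpp
// Sketch only: llama_batch_add() (common.h) appends a token and advances
// batch.n_tokens internally, so the caller must not increment it again.
#include "common.h"
#include "llama.h"
#include <vector>

// hypothetical helper name, for illustration
static void queue_prompt_tokens(llama_batch & batch,
                                const std::vector<llama_token> & tokens,
                                llama_seq_id seq_id) {
    llama_batch_clear(batch);                          // resets batch.n_tokens to 0
    for (size_t i = 0; i < tokens.size(); ++i) {
        llama_batch_add(batch, tokens[i], (llama_pos) i, { seq_id }, false);
        // note: no manual batch.n_tokens++ here - the helper already did it
    }
    if (batch.n_tokens > 0) {
        batch.logits[batch.n_tokens - 1] = true;       // request logits only for the last token
    }
}
```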
Georgi Gerganov
a8c981b734 server : remove beam-search functionality 2023-10-19 14:10:37 +03:00
Georgi Gerganov
654e0a1fe0 server : coding-style normalization (part 2) 2023-10-19 14:09:45 +03:00
Georgi Gerganov
e44ed60187 server : coding-style normalization 2023-10-19 13:50:23 +03:00
FSSRepo
ab2fc00224 latest changes to the sampling API 2023-10-18 16:57:48 -04:00
FSSRepo
8540568c48 Merge branch 'master' of https://github.com/ggerganov/llama.cpp 2023-10-18 16:55:26 -04:00
FSSRepo
7196c4e08a new sampling API 2023-10-18 16:50:09 -04:00
Georgi Gerganov
0e89203b51 speculative : add tree-based sampling example (#3624)
* sampling : one sequence per sampling context

ggml-ci

* speculative : add tree-based sampling support

ggml-ci

* speculative : reuse the n_parallel CLI param

* speculative : refactor sampling

* examples : fix build after sampling refactoring

ggml-ci

* batched : fix n_seq_id

* sampling : fix malloc

ggml-ci

* swift : fix build

ggml-ci

* swift : try to fix build

ggml-ci

* prompts : add assistant.txt

* common : add llama_batch_add() and llama_batch_clear() helpers

* speculative : minor refactor

ggml-ci

* minor : comments + rename

ggml-ci

* speculative : fix off-by-one for n_drafted

* speculative : fix the n_drafted fix + p constants
2023-10-18 16:21:57 +03:00
FSSRepo
c02c52efb5 fix multiple clients 2023-10-17 17:54:56 -04:00
FSSRepo
d2b1fac6c7 fix make build errors 2023-10-17 17:18:56 -04:00
FSSRepo
ed0c11cb83 multimodal support enabled by default 2023-10-17 16:58:20 -04:00
FSSRepo
6c277eaab5 update API to be OpenAI-like 2023-10-17 16:53:38 -04:00
FSSRepo
58f8ae9bfe readme change 2023-10-17 16:32:19 -04:00
FSSRepo
fa0f22f14f Merge remote-tracking branch 'upstream/master' 2023-10-17 16:31:33 -04:00
FSSRepo
aa2268f4cd sync README.md changes 2023-10-17 16:21:05 -04:00
Georgi Gerganov
e74c705e15 editorconfig : remove trailing spaces 2023-10-17 19:52:53 +03:00
coezbek
3ad1e3f1a1 server : documentation of JSON return value of /completion endpoint (#3632)
* Added documentation of JSON return value of /completion endpoint

* Update examples/server/README.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 19:51:02 +03:00
FSSRepo
4d1804330e fix llava implementation 2023-10-16 16:31:17 -04:00
FSSRepo
d7eca255d7 context shift fixed 2023-10-16 14:43:10 -04:00
FSSRepo
2d9f11db28 fixed premature end due to stop word 2023-10-16 12:36:05 -04:00
FSSRepo
fd64f04fc2 fix prompt longer than ctx as proposed in #3639 2023-10-15 19:07:18 -04:00
FSSRepo
b727e022d6 fix ci make build undefined ref errors 2023-10-15 18:53:48 -04:00
FSSRepo
ce961a304b some ci fixes 2023-10-15 18:46:01 -04:00
Steward Garcia
9035978aae Merge pull request #6 from damian0815/fssrepo_mac_fixes
fix compilation errors with llvm
2023-10-15 18:38:52 -04:00
FSSRepo
4e5c5c451c notify the user from the server UI that multimodality is unavailable 2023-10-14 08:28:49 -04:00
Damian Stewart
299f6b54d8 fix compilation errors with llvm 2023-10-14 11:17:38 +02:00
FSSRepo
7e64bfe060 refactor code + remove unused comments + improve README.md 2023-10-14 00:31:34 -04:00
FSSRepo
9f72b44635 add multimodal input - alpha 2023-10-13 23:36:32 -04:00
FSSRepo
de35b47908 fixed token probs 2023-10-13 19:55:25 -04:00
FSSRepo
9d98cdda2c llava multimodal integration 2023-10-13 18:42:44 -04:00
FSSRepo
eb08201227 add changes to README.md 2023-10-13 14:28:06 -04:00