Georgi Gerganov | 569ebf11cf | server : refactor ctx_sampling init + n_ctx + names | 2023-10-22 16:57:05 +03:00

Georgi Gerganov | ef18f4d579 | server : fix crash in Debug on macOS (I have no idea why this fixes it!?) | 2023-10-22 16:55:40 +03:00

Georgi Gerganov | 197a0a9e23 | server : fix switch fallthrough | 2023-10-22 16:55:05 +03:00

Georgi Gerganov | 176993c871 | Merge branch 'master' into server-rev | 2023-10-22 15:04:16 +03:00

FSSRepo | 2eb4c11ec5 | fix image load + view image in chat | 2023-10-21 14:34:19 -04:00

Jhen-Jie Hong | 17b23eb9cb | server : fix multibyte handling in partial response (#3706) | 2023-10-21 14:58:03 +03:00

Georgi Gerganov | d1031cf49c | sampling : refactor init to use llama_sampling_params (#3696) | 2023-10-20 21:07:23 +03:00
  * sampling : refactor init to use llama_sampling_params
  * llama : combine repetition, frequency and presence penalties in 1 call
  * examples : remove embd-input and gptneox-wip
  * sampling : rename penalty params + reduce size of "prev" vector
  * sampling : add llama_sampling_print helper
  * sampling : hide prev behind API and apply #3661
  ggml-ci

Georgi Gerganov | 778c070d1b | server : logs + minor code style | 2023-10-20 20:44:51 +03:00

Georgi Gerganov | 5d540e80d1 | server : no need for atomic int - already using mutex | 2023-10-20 20:44:29 +03:00

Georgi Gerganov | 113dd60005 | server : batch has to be allocated for n_parallel sequences | 2023-10-20 20:42:45 +03:00

FSSRepo | 6b2437e32d | added thread-safe pipeline | 2023-10-20 12:07:32 -04:00

Georgi Gerganov | a0edf73bda | server : fix uninitialized sampling context (close #3685) | 2023-10-20 13:06:10 +03:00

Georgi Gerganov | 325d1793f7 | server : minor sync | 2023-10-19 15:03:24 +03:00

Georgi Gerganov | 9740824ba5 | server : snake case | 2023-10-19 14:44:37 +03:00

Georgi Gerganov | e3a2c3fe32 | server : use refs + use llama_batch_clear() | 2023-10-19 14:44:04 +03:00

Georgi Gerganov | 3d5929e8ee | server : bug fix in ingest_images | 2023-10-19 14:43:19 +03:00
  n_tokens is incremented internally by llama_batch_add

Georgi Gerganov | a8c981b734 | server : remove beam-search functionality | 2023-10-19 14:10:37 +03:00

Georgi Gerganov | 654e0a1fe0 | server : coding-style normalization (part 2) | 2023-10-19 14:09:45 +03:00

Georgi Gerganov | e44ed60187 | server : coding-style normalization | 2023-10-19 13:50:23 +03:00

FSSRepo | ab2fc00224 | latest changes of sampling API | 2023-10-18 16:57:48 -04:00

FSSRepo | 8540568c48 | Merge branch 'master' of https://github.com/ggerganov/llama.cpp | 2023-10-18 16:55:26 -04:00

FSSRepo | 7196c4e08a | new sampling API | 2023-10-18 16:50:09 -04:00

Georgi Gerganov | 0e89203b51 | speculative : add tree-based sampling example (#3624) | 2023-10-18 16:21:57 +03:00
  * sampling : one sequence per sampling context
  * speculative : add tree-based sampling support
  * speculative : reuse the n_parallel CLI param
  * speculative : refactor sampling
  * examples : fix build after sampling refactoring
  * batched : fix n_seq_id
  * sampling : fix malloc
  * swift : fix build
  * swift : try to fix build
  * prompts : add assistant.txt
  * common : add llama_batch_add() and llama_batch_clear() helpers
  * speculative : minor refactor
  * minor : comments + rename
  * speculative : fix off-by-one for n_drafted
  * speculative : fix the n_drafted fix + p constants
  ggml-ci

FSSRepo | c02c52efb5 | fix multiple clients | 2023-10-17 17:54:56 -04:00

FSSRepo | d2b1fac6c7 | fix make build errors | 2023-10-17 17:18:56 -04:00

FSSRepo | ed0c11cb83 | multimodal support enabled by default | 2023-10-17 16:58:20 -04:00

FSSRepo | 6c277eaab5 | update api like OpenAI | 2023-10-17 16:53:38 -04:00

FSSRepo | 4d1804330e | fix llava implementation | 2023-10-16 16:31:17 -04:00

FSSRepo | d7eca255d7 | context shift fixed | 2023-10-16 14:43:10 -04:00

FSSRepo | 2d9f11db28 | fixed premature end due to stop word | 2023-10-16 12:36:05 -04:00

FSSRepo | fd64f04fc2 | fix prompt longer than ctx, as proposed in #3639 | 2023-10-15 19:07:18 -04:00

FSSRepo | b727e022d6 | fix ci make build undefined ref errors | 2023-10-15 18:53:48 -04:00

FSSRepo | ce961a304b | some ci fixes | 2023-10-15 18:46:01 -04:00

Steward Garcia | 9035978aae | Merge pull request #6 from damian0815/fssrepo_mac_fixes | 2023-10-15 18:38:52 -04:00
  fix compilation errors with llvm

FSSRepo | 4e5c5c451c | notify the user from the server UI that multimodality is unavailable | 2023-10-14 08:28:49 -04:00

Damian Stewart | 299f6b54d8 | fix compilation errors with llvm | 2023-10-14 11:17:38 +02:00

FSSRepo | 7e64bfe060 | refactor code + remove unused comments + improved README.md | 2023-10-14 00:31:34 -04:00

FSSRepo | 9f72b44635 | add multimodal input - alpha | 2023-10-13 23:36:32 -04:00

FSSRepo | de35b47908 | fixed token probs | 2023-10-13 19:55:25 -04:00

FSSRepo | 9d98cdda2c | llava multimodal integration | 2023-10-13 18:42:44 -04:00

FSSRepo | eb08201227 | add changes to README.md | 2023-10-13 14:28:06 -04:00

FSSRepo | a2c2d98c16 | add context swap | 2023-10-13 14:12:50 -04:00

FSSRepo | b6d9e212e5 | fixed timings per slot | 2023-10-13 13:10:38 -04:00

FSSRepo | 6358ae5f48 | server UI now supports multiple clients | 2023-10-13 12:22:54 -04:00

FSSRepo | 4ba5a5013d | chat.mjs support cached prompt + some fixes | 2023-10-13 11:06:41 -04:00

FSSRepo | 500ac7120e | cached prompt support | 2023-10-12 21:16:12 -04:00

FSSRepo | 83c2b3553a | grammar + no stream completion | 2023-10-12 18:43:57 -04:00

FSSRepo | 5b8e29de53 | multiple client support | 2023-10-12 17:09:12 -04:00

FSSRepo | 81484805f0 | completion endpoint working | 2023-10-12 16:17:27 -04:00

FSSRepo | 29c8cdd65d | refactored sampling function | 2023-10-12 15:02:19 -04:00