FSSRepo
c02c52efb5
fix multiple clients
2023-10-17 17:54:56 -04:00
FSSRepo
d2b1fac6c7
fix make build errors
2023-10-17 17:18:56 -04:00
FSSRepo
ed0c11cb83
multimodal support enabled by default
2023-10-17 16:58:20 -04:00
FSSRepo
6c277eaab5
update API to be OpenAI-like
2023-10-17 16:53:38 -04:00
FSSRepo
58f8ae9bfe
readme change
2023-10-17 16:32:19 -04:00
FSSRepo
fa0f22f14f
Merge remote-tracking branch 'upstream/master'
2023-10-17 16:31:33 -04:00
slaren
cb33f43a2a
fix embeddings when using CUDA ( #3657 )
2023-10-17 22:24:50 +02:00
FSSRepo
aa2268f4cd
sync README.md changes
2023-10-17 16:21:05 -04:00
Georgi Gerganov
e1675d133c
llama : avoid fprintf in favor of LLAMA_LOG ( #3538 )
2023-10-17 22:34:26 +03:00
BarfingLemurs
8402566a7c
readme : update hot-topics & models, detail windows release in usage ( #3615 )
* Update README.md
* Update README.md
* Update README.md
* move "Running on Windows" section below "Prepare data and run"
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 21:13:21 +03:00
shibe2
40e5ce054f
CLBlast: Fix temporary buffer size for f16 conversion (wsize)
Fix buffer overflow.
Reduce the size to fit just one 2D slice.
Assert sufficient size.
2023-10-17 21:02:30 +04:00
slaren
a5e8c1d8c7
train-text-from-scratch : fix assert failure in ggml-alloc ( #3618 )
2023-10-17 20:00:58 +03:00
Georgi Gerganov
e74c705e15
editorconfig : remove trailing spaces
2023-10-17 19:52:53 +03:00
coezbek
3ad1e3f1a1
server : documentation of JSON return value of /completion endpoint ( #3632 )
* Added documentation of JSON return value of /completion endpoint
* Update examples/server/README.md
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 19:51:02 +03:00
Georgi Gerganov
1142013da4
save-load-state : fix example + add ci test ( #3655 )
* save-load-state : fix example (close #3606 )
* ci : add test for save-load-state example
ggml-ci
2023-10-17 19:12:46 +03:00
ldwang
5fe268a4d9
readme : add Aquila2 links ( #3610 )
Signed-off-by: ldwang <ftgreat@gmail.com>
Co-authored-by: ldwang <ftgreat@gmail.com>
2023-10-17 18:52:33 +03:00
staviq
1a159553f9
tokenizer : special token handling ( #3538 )
* Rewrite special token handling from #1931
* shorten param name, add st verification by type
* use offsets instead of copy by substr
* formatting, remove copying iterator on delete
* llama : normalize code-style
* swift fix
* print pfx/sfx if verb, main: split pfx input sfx
* don't add space when using special tokens
* minor : comment + spacing
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 18:11:01 +03:00
Georgi Gerganov
281ef73c25
k-quants : fix quantization ranges ( #3646 )
2023-10-17 09:19:28 +03:00
Georgi Gerganov
940efa95fe
llava : fix tokenization to not add bos between image embeddings and user prompt ( #3645 )
* llava : fix tokenization to not add bos after system prompt
* set seed
---------
Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
2023-10-16 23:58:00 +03:00
FSSRepo
4d1804330e
fix llava implementation
2023-10-16 16:31:17 -04:00
FSSRepo
d7eca255d7
context shift fixed
2023-10-16 14:43:10 -04:00
FSSRepo
2d9f11db28
fixed premature end due to stop word
2023-10-16 12:36:05 -04:00
FSSRepo
fd64f04fc2
fix prompt longer than ctx as proposed in #3639
2023-10-15 19:07:18 -04:00
FSSRepo
b727e022d6
fix CI make build undefined reference errors
2023-10-15 18:53:48 -04:00
FSSRepo
ce961a304b
some CI fixes
2023-10-15 18:46:01 -04:00
Steward Garcia
9035978aae
Merge pull request #6 from damian0815/fssrepo_mac_fixes
fix compilation errors with llvm
2023-10-15 18:38:52 -04:00
Steward Garcia
f47fd17b73
Merge branch 'ggerganov:master' into master
2023-10-15 18:23:47 -04:00
cebtenzzre
11bff29045
MPT : support GQA for replit-code-v1.5 ( #3627 )
2023-10-15 09:32:06 +03:00
FSSRepo
4e5c5c451c
notify the user from the server UI that multimodality is unavailable
2023-10-14 08:28:49 -04:00
M. Yusuf Sarıgöz
11dc1091f6
Honor -ngl option for Cuda offloading in llava ( #3621 )
2023-10-14 04:52:44 -06:00
Damian Stewart
299f6b54d8
fix compilation errors with llvm
2023-10-14 11:17:38 +02:00
FSSRepo
7e64bfe060
refactor code + remove unused comments + improved README.md
2023-10-14 00:31:34 -04:00
FSSRepo
9f72b44635
add multimodal input - alpha
2023-10-13 23:36:32 -04:00
FSSRepo
de35b47908
fixed token probs
2023-10-13 19:55:25 -04:00
FSSRepo
9d98cdda2c
llava multimodal integration
2023-10-13 18:42:44 -04:00
FSSRepo
eb08201227
add changes to README.md
2023-10-13 14:28:06 -04:00
FSSRepo
a2c2d98c16
add context swap
2023-10-13 14:12:50 -04:00
FSSRepo
b6d9e212e5
fixed timings per slot
2023-10-13 13:10:38 -04:00
FSSRepo
a410a9e300
unused change reverted
2023-10-13 12:23:58 -04:00
FSSRepo
6358ae5f48
server UI now supports multiple clients
2023-10-13 12:22:54 -04:00
FSSRepo
4ba5a5013d
chat.mjs supports cached prompt + some fixes
2023-10-13 11:06:41 -04:00
Daniel Bevenius
2a4bcbacea
llama : remove n_threads from llama_decode_internal ( #3614 )
This commit removes `n_threads` from the `llama_decode_internal`
function's doc comment, as the parameter no longer exists.
It looks like this parameter was removed in
commit 16bc66d947 ("llama.cpp : split
llama_context_params into model and context params").
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2023-10-13 13:33:16 +03:00
slaren
424b6381c4
ggml : add context enumeration functions ( #3605 )
finetune : fix assert failure in ggml-alloc
2023-10-13 12:23:10 +02:00
FSSRepo
500ac7120e
cached prompt support
2023-10-12 21:16:12 -04:00
FSSRepo
83c2b3553a
grammar + no stream completion
2023-10-12 18:43:57 -04:00
FSSRepo
5b8e29de53
multiple client support
2023-10-12 17:09:12 -04:00
FSSRepo
81484805f0
completion endpoint working
2023-10-12 16:17:27 -04:00
shibe2
1e0e873c37
CLBlast: Fix matrix-vector multiplication ( #3544 )
2023-10-12 21:59:47 +02:00
FSSRepo
29c8cdd65d
refactored sampling function
2023-10-12 15:02:19 -04:00
FSSRepo
b716eeb72a
Merge branch 'master' of https://github.com/ggerganov/llama.cpp
2023-10-12 12:55:08 -04:00