FSSRepo
7196c4e08a
new sampling API
2023-10-18 16:50:09 -04:00
FSSRepo
c02c52efb5
fix multiple clients
2023-10-17 17:54:56 -04:00
FSSRepo
d2b1fac6c7
fix make build errors
2023-10-17 17:18:56 -04:00
FSSRepo
ed0c11cb83
multimodal support enabled by default
2023-10-17 16:58:20 -04:00
FSSRepo
6c277eaab5
update api like OpenAI
2023-10-17 16:53:38 -04:00
FSSRepo
58f8ae9bfe
readme change
2023-10-17 16:32:19 -04:00
FSSRepo
fa0f22f14f
Merge remote-tracking branch 'upstream/master'
2023-10-17 16:31:33 -04:00
FSSRepo
aa2268f4cd
sync README.md changes
2023-10-17 16:21:05 -04:00
Georgi Gerganov
e74c705e15
editorconfig : remove trailing spaces
2023-10-17 19:52:53 +03:00
coezbek
3ad1e3f1a1
server : documentation of JSON return value of /completion endpoint ( #3632 )
* Added documentation of JSON return value of /completion endpoint
* Update examples/server/README.md
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 19:51:02 +03:00
FSSRepo
4d1804330e
fix llava implementation
2023-10-16 16:31:17 -04:00
FSSRepo
d7eca255d7
context shift fixed
2023-10-16 14:43:10 -04:00
FSSRepo
2d9f11db28
fixed premature end due to stop word
2023-10-16 12:36:05 -04:00
FSSRepo
fd64f04fc2
fix prompt longer than ctx, proposed in #3639
2023-10-15 19:07:18 -04:00
FSSRepo
b727e022d6
fix ci make build undefined ref errors
2023-10-15 18:53:48 -04:00
FSSRepo
ce961a304b
some ci fixes
2023-10-15 18:46:01 -04:00
Steward Garcia
9035978aae
Merge pull request #6 from damian0815/fssrepo_mac_fixes
fix compilation errors with llvm
2023-10-15 18:38:52 -04:00
FSSRepo
4e5c5c451c
notify the user from server ui that multimodality is unavailable
2023-10-14 08:28:49 -04:00
Damian Stewart
299f6b54d8
fix compilation errors with llvm
2023-10-14 11:17:38 +02:00
FSSRepo
7e64bfe060
refactor code + remove unused comments + improved README.md
2023-10-14 00:31:34 -04:00
FSSRepo
9f72b44635
add multimodal input - alpha
2023-10-13 23:36:32 -04:00
FSSRepo
de35b47908
fixed tokens probs
2023-10-13 19:55:25 -04:00
FSSRepo
9d98cdda2c
llava multimodal integration
2023-10-13 18:42:44 -04:00
FSSRepo
eb08201227
add changes to README.md
2023-10-13 14:28:06 -04:00
FSSRepo
a2c2d98c16
add context swap
2023-10-13 14:12:50 -04:00
FSSRepo
b6d9e212e5
fixed timings per slot
2023-10-13 13:10:38 -04:00
FSSRepo
a410a9e300
unused change reverted
2023-10-13 12:23:58 -04:00
FSSRepo
6358ae5f48
server ui now supports multiple clients
2023-10-13 12:22:54 -04:00
FSSRepo
4ba5a5013d
chat.mjs support cached prompt + some fixes
2023-10-13 11:06:41 -04:00
FSSRepo
500ac7120e
cached prompt support
2023-10-12 21:16:12 -04:00
FSSRepo
83c2b3553a
grammar + no stream completion
2023-10-12 18:43:57 -04:00
FSSRepo
5b8e29de53
multiple client support
2023-10-12 17:09:12 -04:00
FSSRepo
81484805f0
completion endpoint working
2023-10-12 16:17:27 -04:00
FSSRepo
29c8cdd65d
refactored sampling function
2023-10-12 15:02:19 -04:00
FSSRepo
b716eeb72a
Merge branch 'master' of https://github.com/ggerganov/llama.cpp
2023-10-12 12:55:08 -04:00
FSSRepo
78504218b9
save dev progress
2023-10-12 12:51:48 -04:00
Aarni Koskela
b016596d90
server : add completion mode (no chat) ( #3582 )
2023-10-12 09:51:53 +03:00
Georgi Gerganov
57dd55e2c7
server : fix kv cache management ( #3588 )
2023-10-12 09:29:04 +03:00
FSSRepo
471230202d
crash fixed
2023-10-11 19:48:15 -04:00
FSSRepo
63f99b1ea6
implementing parallel decoding in server example
2023-10-11 18:14:11 -04:00
Michael Coppola
a8bdd65525
server : add parameter -tb N, --threads-batch N ( #3584 )
Co-authored-by: Michael Coppola <info@michaeljcoppola.com>
2023-10-11 22:42:22 +03:00
Kerfuffle
70c29da118
common : fix mirostat state when using multiple sequences ( #3543 )
* Fix mirostat state when using multiple sequences
* Fix mirostat by completely refactoring sampling!
* Try to fix zig build.
* Export function to fetch/create default sampler states
Code formatting cleanups and add some comments
Silence a warning about id not being used when logging is disabled
* Apply some renaming suggestions.
Fix comments that were out of sync with the pull.
* Use more consistent naming convention for sampling contexts
2023-10-11 22:35:46 +03:00
vvhg1
11ea5c7d96
infill. : fix tokenization ( #3508 )
* infill tokens correction
* server infill tokens correction
* removing any leading whitespace from infill suffix and removing leading space token from suffix when params.escape
* only rm when params.escape, rm space if possible which is added back or rm added space token
* Revert "only rm when params.escape, rm space if possible which is added back or rm added space token"
This reverts commit 63ba0b621f.
* fix interactive prompt escaping and fix server infill leading space handling
* rm unnecessary bool check
2023-10-10 10:31:21 +03:00
Ryder Wishart
8e6716a102
api_like_OAI.py : compat with Microsoft Guidance ( #2746 )
Check for None in addition to empty string check in all request params
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-08 13:55:58 +03:00
arcrank
9c38d181d4
api_like_OAI.py : simplify function ( #2796 )
Simplify function
2023-10-08 13:52:57 +03:00
Mihai
cb13d73a72
server : docs fix default values and add n_probs ( #3506 )
2023-10-06 21:39:33 +03:00
Jhen-Jie Hong
97af49fa39
server : reuse llama_sample_token common util ( #3494 )
* server : reuse llama_sample_token common function
* common : use n_probs for temperature sampling
2023-10-06 15:44:24 +03:00
Kenvix ⭐
45eba9369f
build : use std::make_tuple() for compatibility with older GCC versions ( #3488 )
2023-10-05 20:16:39 +03:00
Jhen-Jie Hong
e8b8d32e86
server : fix incorrect num_tokens_predicted ( #3480 )
2023-10-05 17:02:55 +03:00
Georgi Gerganov
ac2219fef3
llama : fix session saving/loading ( #3400 )
* llama : fix session saving/loading
* llama : temp fix for clearing "future" tokens from the KV cache
* llama : fix handling of "future" tokens when loading sessions
* llama : fix comments for llama_kv_cache API
2023-10-03 21:04:01 +03:00