Commit Graph

146 Commits

Author SHA1 Message Date
M. Yusuf Sarıgöz
2679c432d5 Update readme to document multimodal in server 2023-10-22 19:49:33 +03:00
Georgi Gerganov
f305d6434f editorconfig : new line in index.html 2023-10-22 19:10:30 +03:00
M. Yusuf Sarıgöz
5359fb9267 Do not save/load image_data to localStorage 2023-10-22 19:08:09 +03:00
Georgi Gerganov
f67d971344 server : bug fix for prompt caching 2023-10-22 17:52:59 +03:00
Georgi Gerganov
569ebf11cf server : refactor ctx_sampling init + n_ctx + names 2023-10-22 16:57:05 +03:00
Georgi Gerganov
ef18f4d579 server : fix crash in Debug on macOS (I have no idea why this fixes it!?) 2023-10-22 16:55:40 +03:00
Georgi Gerganov
197a0a9e23 server : fix switch fallthrough 2023-10-22 16:55:05 +03:00
Georgi Gerganov
176993c871 Merge branch 'master' into server-rev 2023-10-22 15:04:16 +03:00
FSSRepo
2eb4c11ec5 fix image load + view image in chat 2023-10-21 14:34:19 -04:00
Jhen-Jie Hong
17b23eb9cb server : fix multibyte handling in partial response (#3706) 2023-10-21 14:58:03 +03:00
Georgi Gerganov
d1031cf49c sampling : refactor init to use llama_sampling_params (#3696)
* sampling : refactor init to use llama_sampling_params

* llama : combine repetition, frequency and presence penalties in 1 call

* examples : remove embd-input and gptneox-wip

* sampling : rename penalty params + reduce size of "prev" vector

* sampling : add llama_sampling_print helper

* sampling : hide prev behind API and apply #3661

ggml-ci
2023-10-20 21:07:23 +03:00
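For reference, a minimal sketch of how the refactored sampling init from the commit above might be used. This is an assumption-laden illustration, not code from the commit: it assumes the llama_sampling_init()/llama_sampling_free() helpers and field names exposed by common/sampling.h after the refactor, and the values are made up.

```cpp
// Sketch only: assumes common/sampling.h exposes llama_sampling_params,
// llama_sampling_init() and llama_sampling_free() after the refactor above.
#include "sampling.h"

int main() {
    llama_sampling_params sparams;      // per-sequence sampling parameters
    sparams.temp           = 0.7f;      // illustrative values, not defaults
    sparams.top_p          = 0.9f;
    sparams.penalty_repeat = 1.1f;      // renamed penalty param per the commit message

    // one sampling context per sequence, created from the params alone
    llama_sampling_context * ctx_sampling = llama_sampling_init(sparams);

    // ... llama_sampling_sample() / llama_sampling_accept() per generated token ...

    llama_sampling_free(ctx_sampling);
    return 0;
}
```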
Georgi Gerganov
778c070d1b server : logs + minor code style 2023-10-20 20:44:51 +03:00
Georgi Gerganov
5d540e80d1 server : no need for atomic int - already using mutex 2023-10-20 20:44:29 +03:00
Georgi Gerganov
113dd60005 server : batch has to be allocated for n_parallel sequences 2023-10-20 20:42:45 +03:00
FSSRepo
6b2437e32d added thread-safe pipeline 2023-10-20 12:07:32 -04:00
Georgi Gerganov
a0edf73bda server : fix uninitialized sampling context (close #3685) 2023-10-20 13:06:10 +03:00
Georgi Gerganov
325d1793f7 server : minor sync 2023-10-19 15:03:24 +03:00
Georgi Gerganov
9740824ba5 server : snake case 2023-10-19 14:44:37 +03:00
Georgi Gerganov
e3a2c3fe32 server : use refs + use llama_batch_clear() 2023-10-19 14:44:04 +03:00
Georgi Gerganov
3d5929e8ee server : bug fix in ingest_images
n_tokens is incremented internally by llama_batch_add
2023-10-19 14:43:19 +03:00
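A minimal sketch of the batching pattern that the ingest_images fix above implies, assuming the llama_batch_clear()/llama_batch_add() helpers from common.h; queue_prompt_tokens is a hypothetical name used only for illustration.

```cpp
// Sketch only: llama_batch_add() (common.h) appends a token and advances
// batch.n_tokens internally, so the caller must not increment it again.
#include "common.h"
#include "llama.h"
#include <vector>

// hypothetical helper name, for illustration
static void queue_prompt_tokens(llama_batch & batch,
                                const std::vector<llama_token> & tokens,
                                llama_seq_id seq_id) {
    llama_batch_clear(batch);                          // resets batch.n_tokens to 0
    for (size_t i = 0; i < tokens.size(); ++i) {
        llama_batch_add(batch, tokens[i], (llama_pos) i, { seq_id }, false);
        // note: no manual batch.n_tokens++ here - the helper already did it
    }
    if (batch.n_tokens > 0) {
        batch.logits[batch.n_tokens - 1] = true;       // request logits only for the last token
    }
}
```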
Georgi Gerganov
a8c981b734 server : remove beam-search functionality 2023-10-19 14:10:37 +03:00
Georgi Gerganov
654e0a1fe0 server : coding-style normalization (part 2) 2023-10-19 14:09:45 +03:00
Georgi Gerganov
e44ed60187 server : coding-style normalization 2023-10-19 13:50:23 +03:00
FSSRepo
ab2fc00224 latest changes to the sampling API 2023-10-18 16:57:48 -04:00
FSSRepo
8540568c48 Merge branch 'master' of https://github.com/ggerganov/llama.cpp 2023-10-18 16:55:26 -04:00
FSSRepo
7196c4e08a new sampling API 2023-10-18 16:50:09 -04:00
Georgi Gerganov
0e89203b51 speculative : add tree-based sampling example (#3624)
* sampling : one sequence per sampling context

ggml-ci

* speculative : add tree-based sampling support

ggml-ci

* speculative : reuse the n_parallel CLI param

* speculative : refactor sampling

* examples : fix build after sampling refactoring

ggml-ci

* batched : fix n_seq_id

* sampling : fix malloc

ggml-ci

* swift : fix build

ggml-ci

* swift : try to fix build

ggml-ci

* prompts : add assistant.txt

* common : add llama_batch_add() and llama_batch_clear() helpers

* speculative : minor refactor

ggml-ci

* minor : comments + rename

ggml-ci

* speculative : fix off-by-one for n_drafted

* speculative : fix the n_drafted fix + p constants
2023-10-18 16:21:57 +03:00
FSSRepo
c02c52efb5 fix multiple clients 2023-10-17 17:54:56 -04:00
FSSRepo
d2b1fac6c7 fix make build errors 2023-10-17 17:18:56 -04:00
FSSRepo
ed0c11cb83 multimodal support enabled by default 2023-10-17 16:58:20 -04:00
FSSRepo
6c277eaab5 update API to be OpenAI-like 2023-10-17 16:53:38 -04:00
FSSRepo
58f8ae9bfe readme change 2023-10-17 16:32:19 -04:00
FSSRepo
fa0f22f14f Merge remote-tracking branch 'upstream/master' 2023-10-17 16:31:33 -04:00
FSSRepo
aa2268f4cd sync README.md changes 2023-10-17 16:21:05 -04:00
Georgi Gerganov
e74c705e15 editorconfig : remove trailing spaces 2023-10-17 19:52:53 +03:00
coezbek
3ad1e3f1a1 server : documentation of JSON return value of /completion endpoint (#3632)
* Added documentation of JSON return value of /completion endpoint

* Update examples/server/README.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 19:51:02 +03:00
FSSRepo
4d1804330e fix llava implementation 2023-10-16 16:31:17 -04:00
FSSRepo
d7eca255d7 context shift fixed 2023-10-16 14:43:10 -04:00
FSSRepo
2d9f11db28 fixed premature end due to stop word 2023-10-16 12:36:05 -04:00
FSSRepo
fd64f04fc2 fix prompt longer than ctx as proposed in #3639 2023-10-15 19:07:18 -04:00
FSSRepo
b727e022d6 fix ci make build undefined ref errors 2023-10-15 18:53:48 -04:00
FSSRepo
ce961a304b some ci fixes 2023-10-15 18:46:01 -04:00
Steward Garcia
9035978aae Merge pull request #6 from damian0815/fssrepo_mac_fixes
fix compilation errors with llvm
2023-10-15 18:38:52 -04:00
FSSRepo
4e5c5c451c notify the user from the server UI that multimodality is unavailable 2023-10-14 08:28:49 -04:00
Damian Stewart
299f6b54d8 fix compilation errors with llvm 2023-10-14 11:17:38 +02:00
FSSRepo
7e64bfe060 refactor code + remove unused comments + improve README.md 2023-10-14 00:31:34 -04:00
FSSRepo
9f72b44635 add multimodal input - alpha 2023-10-13 23:36:32 -04:00
FSSRepo
de35b47908 fixed token probs 2023-10-13 19:55:25 -04:00
FSSRepo
9d98cdda2c llava multimodal integration 2023-10-13 18:42:44 -04:00
FSSRepo
eb08201227 add changes to README.md 2023-10-13 14:28:06 -04:00