llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-27 20:04:35 +00:00

Author	SHA1	Message	Date
FSSRepo	c02c52efb5	fix multiple clients	2023-10-17 17:54:56 -04:00
FSSRepo	d2b1fac6c7	fix make bui;d errors	2023-10-17 17:18:56 -04:00
FSSRepo	ed0c11cb83	multimodal support enabled by default	2023-10-17 16:58:20 -04:00
FSSRepo	6c277eaab5	update api like OpenAI	2023-10-17 16:53:38 -04:00
FSSRepo	58f8ae9bfe	readme change	2023-10-17 16:32:19 -04:00
FSSRepo	fa0f22f14f	Merge remote-tracking branch 'upstream/master'	2023-10-17 16:31:33 -04:00
FSSRepo	aa2268f4cd	sync README.md changes	2023-10-17 16:21:05 -04:00
Georgi Gerganov	e1675d133c	llama : avoid fprintf in favor of LLAMA_LOG (#3538)	2023-10-17 22:34:26 +03:00
slaren	a5e8c1d8c7	train-text-from-scratch : fix assert failure in ggml-alloc (#3618)	2023-10-17 20:00:58 +03:00
Georgi Gerganov	e74c705e15	editorconfig : remove trailing spaces	2023-10-17 19:52:53 +03:00
coezbek	3ad1e3f1a1	server : documentation of JSON return value of /completion endpoint (#3632) * Added documentation of JSON return value of /completion endpoint * Update examples/server/README.md --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-10-17 19:51:02 +03:00
Georgi Gerganov	1142013da4	save-load-state : fix example + add ci test (#3655) * save-load-state : fix example (close #3606) * ci : add test for save-load-state example ggml-ci	2023-10-17 19:12:46 +03:00
staviq	1a159553f9	tokenizer : special token handling (#3538) * Rewrite special token handling from #1931 * shorten param name, add st verification by type * use offsets instead of copy by substr * formatting, remove copying iterator on delete * llama : normalize code-style * swift fix * print pfx/sfx if verb, main: split pfx input sfx * dont add space when using special tokens * minor : comment + spacing --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-10-17 18:11:01 +03:00
Georgi Gerganov	940efa95fe	llava : fix tokenization to not add bos between image embeddings and user prompt (#3645) * llava : fix tokenization to not add bos after system prompt * set seed --------- Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>	2023-10-16 23:58:00 +03:00
FSSRepo	4d1804330e	fix llava implementation	2023-10-16 16:31:17 -04:00
FSSRepo	d7eca255d7	context shift fixed	2023-10-16 14:43:10 -04:00
FSSRepo	2d9f11db28	fixed premature end due stop word	2023-10-16 12:36:05 -04:00
FSSRepo	fd64f04fc2	fix long prompt than ctx proposed in #3639	2023-10-15 19:07:18 -04:00
FSSRepo	b727e022d6	fix ci make build undefined ref errors	2023-10-15 18:53:48 -04:00
FSSRepo	ce961a304b	some ci fixes	2023-10-15 18:46:01 -04:00
Steward Garcia	9035978aae	Merge pull request #6 from damian0815/fssrepo_mac_fixes fix compilation errors with llvm	2023-10-15 18:38:52 -04:00
Steward Garcia	f47fd17b73	Merge branch 'ggerganov:master' into master	2023-10-15 18:23:47 -04:00
FSSRepo	4e5c5c451c	notify the user from server ui that multimodality is unavialable	2023-10-14 08:28:49 -04:00
M. Yusuf Sarıgöz	11dc1091f6	Honor -ngl option for Cuda offloading in llava (#3621)	2023-10-14 04:52:44 -06:00
Damian Stewart	299f6b54d8	fix compilation errors with llvm	2023-10-14 11:17:38 +02:00
FSSRepo	7e64bfe060	refactor code + remove unused comments + improved README.md	2023-10-14 00:31:34 -04:00
FSSRepo	9f72b44635	add multimodal input - alfa	2023-10-13 23:36:32 -04:00
FSSRepo	de35b47908	fixed tokens probs	2023-10-13 19:55:25 -04:00
FSSRepo	9d98cdda2c	llava multimodal integration	2023-10-13 18:42:44 -04:00
FSSRepo	eb08201227	add changes to README.md	2023-10-13 14:28:06 -04:00
FSSRepo	a2c2d98c16	add context swap	2023-10-13 14:12:50 -04:00
FSSRepo	b6d9e212e5	fixed timings per slot	2023-10-13 13:10:38 -04:00
FSSRepo	a410a9e300	unused change reverted	2023-10-13 12:23:58 -04:00
FSSRepo	6358ae5f48	server ui now support multiple clients	2023-10-13 12:22:54 -04:00
FSSRepo	4ba5a5013d	chat.mjs support cached prompt + some fixes	2023-10-13 11:06:41 -04:00
slaren	424b6381c4	ggml : add context enumeration functions (#3605) finetune : fix assert failure in ggml-alloc	2023-10-13 12:23:10 +02:00
FSSRepo	500ac7120e	cached prompt support	2023-10-12 21:16:12 -04:00
FSSRepo	83c2b3553a	grammar + no stream completion	2023-10-12 18:43:57 -04:00
FSSRepo	5b8e29de53	multiple client support	2023-10-12 17:09:12 -04:00
FSSRepo	81484805f0	completion endpoint working	2023-10-12 16:17:27 -04:00
FSSRepo	29c8cdd65d	refactored sampling function	2023-10-12 15:02:19 -04:00
FSSRepo	b716eeb72a	Merge branch 'master' of https://github.com/ggerganov/llama.cpp	2023-10-12 12:55:08 -04:00
FSSRepo	78504218b9	save dev progress	2023-10-12 12:51:48 -04:00
M. Yusuf Sarıgöz	370359e5ba	examples: support LLaVA v1.5 (multimodal model) (#3436) * WIP: start implementing LLaVA * rm scratch buf for now, will revert after cleanup * LLaVA image encoder is working. will combine with llama * Add llava inference code, but it's buggy. debugging * LLaVA is working e2e, needs to optimize memory allocation + cleanup * Use ggml_allocr + rm unnecessary code * fix: crlf -> lf * fix: new line at EoF * fix: trailing whitespace * Add readme * Update readme * Some cleanup * Are you happy editorconfig? * rm unused batch image preprocessing * rm unused import * fix: rm designated initializers * introduce pad-to-square mode for non-square images * are you happy editorconfig? * gitignore /llava * Handle cases where image file does not exist * add llava target to Makefile * add support for 13b model variant * Maybe seed is unlucky? * Check if apples are compared to apples * are you happy editorconfig? * Use temperature = 0.1 by default * command line: use gpt_params_parse() * minor * handle default n_predict * fix typo * llava : code formatting, rename files, fix compile warnings * do not use Wno-cast-qual for MSVC --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-10-12 18:23:18 +03:00
Aarni Koskela	b016596d90	server : add completion mode (no chat) (#3582)	2023-10-12 09:51:53 +03:00
Georgi Gerganov	57dd55e2c7	server : fix kv cache management (#3588)	2023-10-12 09:29:04 +03:00
FSSRepo	471230202d	crash fixed	2023-10-11 19:48:15 -04:00
FSSRepo	63f99b1ea6	implementing parallel decoding in server example	2023-10-11 18:14:11 -04:00
Georgi Gerganov	b8fe4b5cc9	main : fix session loading bug (#3400)	2023-10-11 23:55:41 +03:00
Michael Coppola	a8bdd65525	server : add parameter -tb N, --threads-batch N (#3584) Co-authored-by: Michael Coppola <info@michaeljcoppola.com>	2023-10-11 22:42:22 +03:00

390 Commits