llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-27 03:44:35 +00:00

Author	SHA1	Message	Date
klosax	66756c82af	convert-llama-h5-to-gguf.py : add tensor data layout	2023-08-15 19:54:33 +02:00
klosax	b6056c3db8	gguf.py : add tensor data layout	2023-08-15 19:53:44 +02:00
klosax	2dd5d2c92c	convert-llama-h5-to-gguf.py : add 70b gqa support	2023-08-15 00:43:10 +02:00
klosax	ca4758290c	gguf-llama.cpp : fix n_head_kv	2023-08-14 23:18:41 +02:00
klosax	ab2cbd03ca	convert-llama-7b-pth-to-gguf.py : add token types	2023-08-14 22:10:50 +02:00
klosax	cedb4870c6	gguf.py : add token types	2023-08-14 22:08:40 +02:00
klosax	5d518d421f	constants.py : add token types	2023-08-14 22:07:53 +02:00
klosax	7ec125b1dc	convert-llama-h5-to-gguf.py : add token types	2023-08-14 22:06:33 +02:00
Georgi Gerganov	6c63550f63	llama : update tokenizer style	2023-08-14 22:11:57 +03:00
Georgi Gerganov	7494c78428	llama : sync gguf-llama with llama (#2613) * llama : sync gguf-llama with llama * tests : fix build + warnings (test-tokenizer-1 still fails) * tests : fix wstring_convert * convert : fix layer names * llama : sync gguf-llama.cpp * convert : update HF converter to new tokenizer voodoo magics	2023-08-14 21:33:33 +03:00
goerch	afc4ca2889	convert : update convert-new.py with tokenizer fixes (#2614) * Merge tokenizer fixes into the gguf branch. * Add test vocabularies * Adapt convert-new.py (and fix a clang-cl compiler error on windows)	2023-08-14 20:20:04 +03:00
goerch	ec1b100720	llama : tokenizer fixes (#2549) * Merge tokenizer fixes into the gguf branch. * Add test vocabularies	2023-08-14 19:30:28 +03:00
Georgi Gerganov	8af3a99ff1	Merge branch 'master' into gguf	2023-08-14 16:39:18 +03:00
Georgi Gerganov	6f14854880	gitignore : add gptneox-main	2023-08-14 16:39:02 +03:00
Jhen-Jie Hong	d783f7982e	metal : return null instead of exit(1) (#2573)	2023-08-14 16:37:39 +03:00
Cheng Shao	d75561df20	server : add --numa support (#2524)	2023-08-14 16:36:42 +03:00
Kamil Tomšík	348acf188c	llama : add missing enum keyword in function signatures (#2610)	2023-08-14 16:35:16 +03:00
Georgi Gerganov	f00780b2ee	llama : sync gguf-llama.cpp with latest llama.cpp (#2608) * llama : sync gguf-llama.cpp with latest llama.cpp * minor : indentation + assert * llama : refactor gguf_buffer and gguf_ctx_buffer * llama : minor	2023-08-14 16:28:44 +03:00
klosax	6f64b6c0f8	Create convert-llama-7b-pth-to-gguf.py	2023-08-14 13:51:09 +02:00
Georgi Gerganov	62490f1380	gguf : use UNIX line ending	2023-08-14 13:04:35 +03:00
Georgi Gerganov	0c19ae70d5	simple : minor style changes	2023-08-14 12:58:12 +03:00
klosax	5c5a95ba2d	gguf.py : dont add empty strings	2023-08-14 11:22:06 +02:00
klosax	a7d226f871	convert-llama-h5-to-gguf.py : fixes	2023-08-14 11:14:24 +02:00
klosax	d753dfbcc8	gptneox-main.cpp : tensor name map changes	2023-08-14 10:59:18 +02:00
klosax	806a15749d	Delete gguf_tensor_map.py	2023-08-14 10:57:19 +02:00
klosax	51939d7d1b	Create gguf_namemap.py : tensor name map changes	2023-08-14 10:56:59 +02:00
klosax	5d22a9db13	convert-gptneox-h5-to-gguf.py : tensor name map changes	2023-08-14 10:55:44 +02:00
Johannes Gäßler	1cd06fa25e	CUDA: launch_bounds, small q4_K, q5_K mmq refactor (#2596)	2023-08-14 10:41:22 +02:00
Jhen-Jie Hong	2feb8934eb	server : fix default grammar by use empty string in the UI (#2604)	2023-08-14 16:20:17 +08:00
Jhen-Jie Hong	5517d6e692	server : implement json-schema-to-grammar.mjs & add grammar param in the UI (#2588) * server : implement json-schema-to-grammar.mjs by follow python impl * server : add grammar support in chat.mjs * server : implement grammer param in the UI * server : generate .hpp * server : remove trailing whitespaces * server : generate .hpp * server : fix sort of prop pairs * server : optimize regex & iteration	2023-08-14 15:16:54 +08:00
Georgi Gerganov	56a1f32072	Merge branch 'master' into gguf	2023-08-14 10:14:05 +03:00
M. Yusuf Sarıgöz	196b50fee7	gguf : add todos and comments	2023-08-14 08:50:47 +03:00
vxiiduu	f31b539714	Enhance Windows 7 and below compatibility. (#2592) * Enhance Windows 7 compatibility. * Clean away unnecessary preprocessor conditional	2023-08-13 20:59:16 -07:00
drbh	ee77efea2a	test : add simple grammar parsing tests (#2594) * adds simple grammar parsing tests * adds cassert header	2023-08-13 17:00:48 +03:00
M. Yusuf Sarıgöz	24f48833ab	fix conflicts	2023-08-13 16:55:42 +03:00
klosax	6beebf3fd9	gptneox-main.cpp : add file_type key	2023-08-13 14:11:01 +02:00
klosax	2827b840e4	convert-gptneox-h5-to-gguf.py : add file_type key	2023-08-13 13:54:10 +02:00
M. Yusuf Sarıgöz	bf2dad3100	convert : rm quantization version	2023-08-13 14:38:53 +03:00
M. Yusuf Sarıgöz	1d60468eee	fix conflicts	2023-08-13 13:35:40 +03:00
M. Yusuf Sarıgöz	91d4bfd536	convert : write more metadata for LLaMA	2023-08-13 13:29:46 +03:00
klosax	17800cd80f	convert-llama-h5-to-gguf.py : load model in parts to save memory	2023-08-13 12:20:02 +02:00
klosax	e3d1f07eb1	convert-gptneox-h5-to-gguf.py : load model in parts to save memory	2023-08-13 12:18:34 +02:00
klosax	9bf5a7efcb	Update gguf_tensor_map.py	2023-08-13 01:27:38 +02:00
Johannes Gäßler	f64d44a9b9	CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590)	2023-08-13 00:24:45 +02:00
klosax	c7bd8c147c	gptneox-main.cpp : n_layer --> n_block	2023-08-13 00:03:32 +02:00
klosax	e91a2224e4	convert-llama-h5-to-gguf.py : n_layer --> n_block	2023-08-13 00:02:44 +02:00
klosax	489616e126	convert-gptneox-h5-to-gguf.py : n_layer --> n_block	2023-08-13 00:02:04 +02:00
klosax	d2ce9cfe8d	gguf.py : n_layer --> n_block	2023-08-13 00:01:20 +02:00
klosax	8b5f0c5067	constants.py : n_layer --> n_block	2023-08-13 00:00:32 +02:00
klosax	5e58ffa1ed	gptneox-main.cpp : n_layer --> n_block	2023-08-12 23:50:58 +02:00

1150 Commits