llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-28 04:14:35 +00:00

Author	SHA1	Message	Date
Jhen-Jie Hong	2feb8934eb	server : fix default grammar by use empty string in the UI (#2604)	2023-08-14 16:20:17 +08:00
Jhen-Jie Hong	5517d6e692	server : implement json-schema-to-grammar.mjs & add grammar param in the UI (#2588) * server : implement json-schema-to-grammar.mjs by follow python impl * server : add grammar support in chat.mjs * server : implement grammer param in the UI * server : generate .hpp * server : remove trailing whitespaces * server : generate .hpp * server : fix sort of prop pairs * server : optimize regex & iteration	2023-08-14 15:16:54 +08:00
Georgi Gerganov	56a1f32072	Merge branch 'master' into gguf	2023-08-14 10:14:05 +03:00
M. Yusuf Sarıgöz	196b50fee7	gguf : add todos and comments	2023-08-14 08:50:47 +03:00
vxiiduu	f31b539714	Enhance Windows 7 and below compatibility. (#2592) * Enhance Windows 7 compatibility. * Clean away unnecessary preprocessor conditional	2023-08-13 20:59:16 -07:00
drbh	ee77efea2a	test : add simple grammar parsing tests (#2594) * adds simple grammar parsing tests * adds cassert header	2023-08-13 17:00:48 +03:00
M. Yusuf Sarıgöz	24f48833ab	fix conflicts	2023-08-13 16:55:42 +03:00
klosax	6beebf3fd9	gptneox-main.cpp : add file_type key	2023-08-13 14:11:01 +02:00
klosax	2827b840e4	convert-gptneox-h5-to-gguf.py : add file_type key	2023-08-13 13:54:10 +02:00
M. Yusuf Sarıgöz	bf2dad3100	convert : rm quantization version	2023-08-13 14:38:53 +03:00
M. Yusuf Sarıgöz	1d60468eee	fix conflicts	2023-08-13 13:35:40 +03:00
M. Yusuf Sarıgöz	91d4bfd536	convert : write more metadata for LLaMA	2023-08-13 13:29:46 +03:00
klosax	17800cd80f	convert-llama-h5-to-gguf.py : load model in parts to save memory	2023-08-13 12:20:02 +02:00
klosax	e3d1f07eb1	convert-gptneox-h5-to-gguf.py : load model in parts to save memory	2023-08-13 12:18:34 +02:00
klosax	9bf5a7efcb	Update gguf_tensor_map.py	2023-08-13 01:27:38 +02:00
Johannes Gäßler	f64d44a9b9	CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590)	2023-08-13 00:24:45 +02:00
klosax	c7bd8c147c	gptneox-main.cpp : n_layer --> n_block	2023-08-13 00:03:32 +02:00
klosax	e91a2224e4	convert-llama-h5-to-gguf.py : n_layer --> n_block	2023-08-13 00:02:44 +02:00
klosax	489616e126	convert-gptneox-h5-to-gguf.py : n_layer --> n_block	2023-08-13 00:02:04 +02:00
klosax	d2ce9cfe8d	gguf.py : n_layer --> n_block	2023-08-13 00:01:20 +02:00
klosax	8b5f0c5067	constants.py : n_layer --> n_block	2023-08-13 00:00:32 +02:00
klosax	5e58ffa1ed	gptneox-main.cpp : n_layer --> n_block	2023-08-12 23:50:58 +02:00
klosax	e606ffeaee	convert-llama-h5-to-gguf.py : simplify nbytes	2023-08-12 22:30:35 +02:00
klosax	f8218477b3	convert-gptneox-h5-to-gguf.py : simplify nbytes	2023-08-12 22:29:35 +02:00
klosax	4cef57c81a	convert-llama-h5-to-gguf.py : no need to convert tensors twice	2023-08-12 21:50:24 +02:00
klosax	8f09157ec9	convert-gptneox-h5-to-gguf.py : no need to convert tensors twice	2023-08-12 21:48:58 +02:00
klosax	5d81a715d4	gguf.py : no need to convert tensors twice	2023-08-12 21:45:45 +02:00
M. Yusuf Sarıgöz	60d540831b	gguf : roper closing of file	2023-08-12 21:42:31 +03:00
M. Yusuf Sarıgöz	202eab04d3	gguf : quantization is working	2023-08-12 16:39:05 +03:00
M. Yusuf Sarıgöz	1fc3d30b71	gguf : start implementing quantization (WIP)	2023-08-12 16:09:47 +03:00
M. Yusuf Sarıgöz	fa7c39540c	gguf : start implementing quantization (WIP)	2023-08-12 15:55:58 +03:00
M. Yusuf Sarıgöz	b2571af255	gguf : start implementing quantization (WIP)	2023-08-12 14:28:17 +03:00
M. Yusuf Sarıgöz	c4f02b4f74	gguf : start implementing quantization (WIP)	2023-08-12 12:01:17 +03:00
M. Yusuf Sarıgöz	0e1a3c7e7d	gguf : start implementing quantization (WIP)	2023-08-12 11:32:34 +03:00
M. Yusuf Sarıgöz	4fa017a1f9	gguf : start implementing quantization (WIP)	2023-08-12 10:40:56 +03:00
M. Yusuf Sarıgöz	186c496fdf	Merge branch 'gguf' of https://github.com//ggerganov/llama.cpp into gguf	2023-08-12 07:25:10 +03:00
M. Yusuf Sarıgöz	2f52008b20	gguf : rm references to old file magics	2023-08-12 07:24:46 +03:00
byte-6174	b19edd54d5	Adding support for llama2.c models (#2559)	2023-08-12 01:17:25 +02:00
Equim	53dc399472	server: fixed wrong variable name in timing json (#2579) * server: fixed wrong variable name in timing json * remove redunct entry	2023-08-12 00:35:14 +02:00
klosax	e76c59d524	Update gptneox-main.cpp	2023-08-11 23:09:49 +02:00
klosax	2a5ac7af44	Update gguf_tensor_map.py	2023-08-11 23:08:48 +02:00
M. Yusuf Sarıgöz	e732423280	gguf : get rid of n_mult, read n_ff from file	2023-08-11 23:50:38 +03:00
M. Yusuf Sarıgöz	f44bbd3d88	gguf : rm redundant method	2023-08-11 21:00:51 +03:00
M. Yusuf Sarıgöz	7009cf581c	gguf : shorter name for member variable	2023-08-11 20:43:02 +03:00
M. Yusuf Sarıgöz	61919c1a8f	gguf : rm references to old file formats	2023-08-11 20:36:11 +03:00
M. Yusuf Sarıgöz	d09fd10713	gguf : write metadata in gguf_file_saver	2023-08-11 20:07:43 +03:00
M. Yusuf Sarıgöz	781b9ec3f5	gguf : write metadata in gguf_file_saver (WIP)	2023-08-11 18:01:26 +03:00
M. Yusuf Sarıgöz	28abfc90fa	gguf : write metadata in gguf_file_saver (WIP)	2023-08-11 13:27:58 +03:00
M. Yusuf Sarıgöz	e3a4960953	gguf : add gguf_get_kv_type	2023-08-11 13:03:23 +03:00
M. Yusuf Sarıgöz	eb8ca6996f	gguf : add gguf_get_kv_type	2023-08-11 12:24:08 +03:00

1172 Commits