Commit Graph

1217 Commits

Author SHA1 Message Date
Georgi Gerganov
25b8a8922d
llama : introduce enum llama_vocab_type + remove hardcoded string constants 2023-08-18 18:46:38 +03:00
Georgi Gerganov
a4ad2bf35c
llama : fix MPI build
ggml-ci
2023-08-18 17:34:27 +03:00
Georgi Gerganov
5d2656d670
llama : avoid hardcoded special tokens 2023-08-18 17:29:20 +03:00
Georgi Gerganov
035d511457
llama : minor API updates 2023-08-18 17:10:20 +03:00
Georgi Gerganov
2d6c2c757c
llama : remove C++ API + reorganize common source in /common dir 2023-08-18 16:22:48 +03:00
Georgi Gerganov
38016ed9ec
Merge branch 'master' into gguf 2023-08-18 15:21:48 +03:00
Georgi Gerganov
660ca9bbca
llama : re-order functions 2023-08-18 14:56:36 +03:00
slaren
097e121e2f
llama : add benchmark example (#2626)
* llama : add benchmark example

* add to examples CMakeLists.txt

* fix msvc build

* add missing include

* add Bessel's correction to stdev calculation

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* improve markdown formatting

* add missing include

* print warning if NDEBUG is not defined

* remove n_prompt and n_gen from the matrix, use each value separately instead

* better checks for non-optimized builds

* llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call

* fix json formatting

* add sql output

* add basic cpu and gpu info (linux/cuda only)

* markdown: also show values that differ from the default

* markdown: add build id

* cleanup

* improve formatting

* formatting

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2023-08-18 12:44:58 +02:00
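The "Bessel's correction to stdev calculation" bullet in the benchmark PR above refers to the standard fix for estimating spread from a sample: divide the squared deviations by n - 1 rather than n. A minimal Python sketch of the idea (the function name here is illustrative, not the actual llama-bench code, which is C++):

```python
import math

def stdev(samples):
    # Sample standard deviation with Bessel's correction:
    # dividing by (n - 1) instead of n gives an unbiased
    # estimate of the variance from a finite sample.
    n = len(samples)
    if n < 2:
        return 0.0
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(var)
```

For repeated benchmark timings, the correction matters most at small sample counts, where dividing by n would noticeably understate the variance.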
mdrokz
eaf98c2649
readme : add link to Rust bindings (#2656) 2023-08-18 13:17:58 +03:00
Georgi Gerganov
e9b12c332e
perplexity : more meaningful ETA number - 2 decimal points 2023-08-18 12:48:55 +03:00
Georgi Gerganov
dea5be61d7
editorconfig : fix whitespaces 2023-08-18 12:42:38 +03:00
Georgi Gerganov
e35f8c744e
tests : update vocab file with new magic 2023-08-18 12:39:22 +03:00
Georgi Gerganov
856afff746
Merge branch 'master' into gguf 2023-08-18 12:38:05 +03:00
Georgi Gerganov
aa3efe87c8
llama : print number of tensors per type + print arch + style 2023-08-18 10:36:45 +03:00
klosax
b275de745d
llama.cpp : get special token kv and linefeed token id 2023-08-18 03:34:30 +02:00
Evan Jones
604b8bdfa6
Fix unicode in grammars (fixes #2501) (#2553)
* Fix unicode in grammars (fixes #2501)

* add more comments

* fix test-llama-grammar
2023-08-17 19:54:44 -04:00
staviq
10151bee2e
server : support for saving templates in browser LocalStorage (#2486)
* support for templates in browser LocalStorage

* sync accepted #2409 fix from upstream

* convert autosave invocation to useEffect

* Apply suggestions from code review

Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>

* Regen index.html.cpp, suggested from code review

---------

Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>
2023-08-18 07:34:01 +08:00
klosax
306070c896
llama.cpp : print kv general.name 2023-08-18 01:06:27 +02:00
Johannes Gäßler
0992a7b8b1
README: fix LLAMA_CUDA_MMV_Y documentation (#2647) 2023-08-17 23:57:59 +02:00
klosax
d9e6890a51
test-tokenizer-0.cpp : fix warning 2023-08-17 23:34:21 +02:00
klosax
147a99bd3a
gguf.py : reverse GGUF_MAGIC 2023-08-17 23:24:04 +02:00
klosax
c20ae49b59
ggml.h : reverse GGUF_MAGIC 2023-08-17 23:23:17 +02:00
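The two "reverse GGUF_MAGIC" commits above change the byte order of the magic constant. A 4-byte magic such as b"GGUF" corresponds to two different u32 values depending on whether it is read as little- or big-endian, so reversing the constant flips that interpretation. A small Python illustration (the hex values shown are what these bytes decode to, not necessarily the exact constants used in ggml.h):

```python
import struct

# The four magic bytes at the start of a GGUF file.
MAGIC_BYTES = b"GGUF"

# The same bytes as a u32, under each byte order.
magic_le = struct.unpack("<I", MAGIC_BYTES)[0]  # little-endian
magic_be = struct.unpack(">I", MAGIC_BYTES)[0]  # big-endian

print(hex(magic_le))  # 0x46554747
print(hex(magic_be))  # 0x47475546
```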
Henri Vasserman
6ddeefad9b
[Zig] Fixing Zig build and improvements (#2554)
* Fix zig after console.o was split

* Better include and flag management

* Change LTO to option
2023-08-17 23:11:18 +03:00
klosax
3c1b7217a9
convert-llama-7b-pth-to-gguf.py : fixes 2023-08-17 21:44:34 +02:00
klosax
9e2d4dd48e
convert-llama-hf-to-gguf.py : fixes 2023-08-17 21:43:48 +02:00
klosax
640ddc4259
gguf.py : gptneox mapping 2023-08-17 21:43:10 +02:00
klosax
b668cd3296
convert-gptneox-hf-to-gguf.py : fixes 2023-08-17 21:42:26 +02:00
M. Yusuf Sarıgöz
fc3a523211
gguf.py : write tensors in a single pass (#2644)
* gguf : single pass for writing tensors + refactoring writer

* gguf : single pass for writing tensors + refactoring writer

* gguf : single pass for writing tensors + refactoring writer

* gguf : style fixes in simple conversion script

* gguf : refactor gptneox conversion script

* gguf : rename h5 to hf (for HuggingFace)

* gguf : refactor pth to gguf conversion script

* gguf : rm file_type key and method

* gguf.py : fix vertical alignment

* gguf.py : indentation

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-17 21:57:39 +03:00
Georgi Gerganov
5484737d58
llama : fix tensor name grepping during quantization
ggml-ci
2023-08-17 21:40:51 +03:00
Georgi Gerganov
57eaadb853
llama : throw error if gguf fails to init from file
ggml-ci
2023-08-17 21:32:14 +03:00
klosax
b3cc182990
llama.cpp : typo 2023-08-17 20:27:50 +02:00
Georgi Gerganov
acaa98234a
convert.py : fix HF tensor permuting / unpacking
ggml-ci
2023-08-17 21:06:45 +03:00
klosax
78e1e57862
quantize-stats.cpp : .bin --> .gguf 2023-08-17 19:18:24 +02:00
klosax
fb11dd3f92
common.h : .bin --> .gguf 2023-08-17 19:16:35 +02:00
Georgi Gerganov
e72c8c2124
ggml : fix bug in gguf_set_kv
ggml-ci
2023-08-17 20:13:48 +03:00
Georgi Gerganov
899f9a5350
llama : fix lambda capture
ggml-ci
2023-08-17 19:49:45 +03:00
Georgi Gerganov
93f285bdf1
gptneox : move as a WIP example 2023-08-17 19:49:45 +03:00
Georgi Gerganov
81a2c2a6f4
llama : fix llama_model_loader memory leak 2023-08-17 19:49:02 +03:00
Georgi Gerganov
dd9e2fc988
ci : update ".bin" to ".gguf" extension
ggml-ci
2023-08-17 19:32:14 +03:00
Georgi Gerganov
c3b739374e
editorconfig : ignore models folder
ggml-ci
2023-08-17 19:17:25 +03:00
Georgi Gerganov
6d66ef96eb
Merge branch 'master' into gguf 2023-08-17 19:04:59 +03:00
Georgi Gerganov
11bf4366c2
llama : sync with recent PRs on master 2023-08-17 19:03:15 +03:00
Georgi Gerganov
8ace03ad3d
convert.py : better always have n_head_kv and default it to n_head 2023-08-17 18:47:06 +03:00
klosax
d646c4efce
convert.py : n_head_kv optional and .gguf file extension 2023-08-17 17:20:36 +02:00
Georgi Gerganov
dd016cc246
Revert "ci : disable CI temporary to not waste energy"
This reverts commit 7e82d25f40.
2023-08-17 17:23:16 +03:00
Georgi Gerganov
2ddd9681d6
convert.py : update to support GGUF output 2023-08-17 17:22:43 +03:00
Georgi Gerganov
e0429d38e4
convert-new.py : output gguf (#2635)
* convert-new.py : output gguf (WIP)

* convert-new.py : add gguf key-value pairs

* llama : add hparams.ctx_train + no longer print ftype

* convert-new.py : minor fixes

* convert-new.py : vocab-only option should work now

* llama : fix tokenizer to use llama_char_to_byte

* tests : add new ggml-vocab-llama.gguf

* convert-new.py : tensor name mapping

* convert-new.py : add map for skipping tensor serialization

* convert-new.py : convert script now works

* gguf.py : pick some of the refactoring from #2644

* convert-new.py : minor fixes
2023-08-17 17:19:52 +03:00
Kerfuffle
8dae7ce684
Add --cfg-negative-prompt-file option for examples (#2591)
Add --cfg-negative-prompt-file option for examples
2023-08-17 07:29:44 -06:00
klosax
d6fd53afd6
llama.cpp : use ggml_elements() 2023-08-17 15:24:35 +02:00
klosax
5a0a2c5685
llama.cpp : print actual model size 2023-08-17 15:18:16 +02:00