llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-25 02:44:36 +00:00


Daniel Bevenius 8228b66dbc gguf : add option to not check tensor data (#6582)	2024-04-10 21:16:48 +03:00

This commit adds an option to the gguf example to not check the tensor data.

The motivation for this is that it can be nice to use the gguf tool to read other .gguf files that were not created by the gguf tool.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
baby-llama	code : normalize enum names (#5697)	2024-02-25 12:09:09 +02:00
batched	metal : pad n_ctx by 32 (#6177)	2024-03-22 09:36:03 +02:00
batched-bench	bench : make n_batch and n_ubatch configurable in Batched bench (#6500)	2024-04-05 21:34:53 +03:00
batched.swift	ggml : add numa options (#5377)	2024-02-16 11:31:07 +02:00
beam-search	ggml : add numa options (#5377)	2024-02-16 11:31:07 +02:00
benchmark	ggml : remove old quantization functions (#5942)	2024-03-09 15:53:59 +02:00
convert-llama2c-to-ggml	llama2c : open file as binary (#6332)	2024-03-27 09:16:02 +02:00
embedding	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
export-lora	ci : add an option to fail on compile warning (#3952)	2024-02-17 23:03:14 +02:00
finetune	code : normalize enum names (#5697)	2024-02-25 12:09:09 +02:00
gbnf-validator	examples : add GBNF validator program (#5948)	2024-04-04 10:44:28 +03:00
gguf	gguf : add option to not check tensor data (#6582)	2024-04-10 21:16:48 +03:00
gguf-split	split: allow --split-max-size option (#6343)	2024-03-29 22:34:44 +01:00
gritlm	gritlm : add initial README.md (#6086)	2024-03-16 17:46:29 +02:00
imatrix	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
infill	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
jeopardy	parallel : add option to load external prompt file (#3416)	2023-10-06 16:16:38 +03:00
llama-bench	cuda : rename build flag to LLAMA_CUDA (#6299)	2024-03-26 01:16:01 +01:00
llama.android	android : fix utf8 decoding error (#5935)	2024-03-10 22:03:17 +02:00
llama.swiftui	llama : add pipeline parallelism support (#6017)	2024-03-13 18:54:21 +01:00
llava	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
lookahead	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
lookup	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
main	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
main-cmake-pkg	cuda : rename build flag to LLAMA_CUDA (#6299)	2024-03-26 01:16:01 +01:00
parallel	llama : greatly reduce output buffer memory usage (#6122)	2024-03-26 16:46:41 +02:00
passkey	llama : fix defrag bugs + add parameter (#5735)	2024-02-27 14:35:51 +02:00
perplexity	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
quantize	ggml : mul_mat_id use the same tensor for all the experts (#6387)	2024-04-03 16:07:05 +03:00
quantize-stats	refactor : switch to emplace_back to avoid extra object (#5291)	2024-02-03 13:23:37 +02:00
retrieval	examples : add "retrieval" (#6193)	2024-03-25 09:38:22 +02:00
save-load-state	llama : save and restore kv cache for single seq id (#6341)	2024-04-08 15:43:30 +03:00
server	minor layout improvements (#6572)	2024-04-10 19:18:25 +02:00
simple	ggml : add numa options (#5377)	2024-02-16 11:31:07 +02:00
speculative	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
sycl	[SYCL] fix SYCL backend build on windows is break by LOG() error (#6290)	2024-03-25 15:52:41 +08:00
tokenize	BERT tokenizer fixes (#6498)	2024-04-09 13:44:08 -04:00
train-text-from-scratch	gguf : fix resource leaks (#6061)	2024-03-14 20:29:32 +02:00
alpaca.sh	alpaca.sh : update model file name (#2074)	2023-07-06 19:17:50 +03:00
base-translate.sh	examples : improve base-translate.sh script (#4783)	2024-01-06 11:40:24 +02:00
chat-13B.bat	Create chat-13B.bat (#592)	2023-03-29 20:21:09 +03:00
chat-13B.sh	examples : read chat prompts from a template file (#1196)	2023-05-03 20:58:11 +03:00
chat-persistent.sh	llama : fix session saving/loading (#3400)	2023-10-03 21:04:01 +03:00
chat-vicuna.sh	examples : add chat-vicuna.sh (#1854)	2023-06-15 21:05:53 +03:00
chat.sh	main : log file (#2748)	2023-08-30 09:29:32 +03:00
CMakeLists.txt	examples : add "retrieval" (#6193)	2024-03-25 09:38:22 +02:00
gpt4all.sh	examples : add -n to alpaca and gpt4all scripts (#706)	2023-04-13 16:03:39 +03:00
json-schema-pydantic-example.py	json-schema-to-grammar improvements (+ added to server) (#5978)	2024-03-21 11:50:43 +00:00
json-schema-to-grammar.py	json-schema-to-grammar : fix order of props + non-str const/enum (#6232)	2024-03-22 15:07:44 +02:00
llama2-13b.sh	gitignore : changes for Poetry users + chat examples (#2284)	2023-07-21 13:53:27 +03:00
llama2.sh	gitignore : changes for Poetry users + chat examples (#2284)	2023-07-21 13:53:27 +03:00
llama.vim	llama.vim : added api key support (#5090)	2024-01-23 08:51:27 +02:00
llm.vim	llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879)	2023-08-30 09:50:55 +03:00
make-ggml.py	make-ggml.py : compatibility with more models and GGUF (#3290)	2023-09-27 19:25:12 +03:00
Miku.sh	MIKU MAYHEM: Upgrading the Default Model for Maximum Fun 🎉 (#2287)	2023-07-21 11:13:18 +03:00
pydantic_models_to_grammar.py	examples : make pydantic scripts pass mypy and support py3.8 (#5099)	2024-01-25 14:51:24 -05:00
pydantic-models-to-grammar-examples.py	examples : make pydantic scripts pass mypy and support py3.8 (#5099)	2024-01-25 14:51:24 -05:00
reason-act.sh	chmod : make scripts executable (#2675)	2023-08-23 17:29:09 +03:00
regex-to-grammar.py	json-schema-to-grammar improvements (+ added to server) (#5978)	2024-03-21 11:50:43 +00:00
server-embd.py	server : refactor (#5882)	2024-03-07 11:41:53 +02:00
server-llama2-13B.sh	chmod : make scripts executable (#2675)	2023-08-23 17:29:09 +03:00
ts-type-to-grammar.sh	json-schema-to-grammar improvements (+ added to server) (#5978)	2024-03-21 11:50:43 +00:00