llama.cpp/examples
Latest commit 0cc63754b8 by Eric Curtin, 2024-11-25 22:56:24 +01:00
Introduce llama-run (#10291)

It is like simple-chat, but it uses smart pointers to avoid manual memory
cleanups, so there are fewer memory leaks in the code. It avoids printing
multiple dots, splits the code into smaller functions, and uses no exception
handling.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
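As a rough illustration of the smart-pointer pattern the commit message describes, the sketch below wraps the llama.cpp C handles in std::unique_ptr with custom deleters so cleanup happens automatically. The wrapper names (model_ptr, context_ptr, load_model, create_context) are hypothetical, and the llama_* calls are taken from the public C API of that period; llama-run's actual code may differ.

```cpp
// Illustrative only: give std::unique_ptr custom deleters that forward to the
// llama.cpp C API's free functions, so handles are released automatically when
// they go out of scope instead of needing manual cleanup calls.
#include <memory>

#include "llama.h"

struct llama_model_deleter {
    void operator()(llama_model * model) const { llama_free_model(model); }
};
struct llama_context_deleter {
    void operator()(llama_context * ctx) const { llama_free(ctx); }
};

using model_ptr   = std::unique_ptr<llama_model,   llama_model_deleter>;
using context_ptr = std::unique_ptr<llama_context, llama_context_deleter>;

// Load a model and create a context; on failure the returned smart pointer is
// simply null, so errors can be handled without exceptions.
static model_ptr load_model(const char * path) {
    llama_model_params params = llama_model_default_params();
    return model_ptr(llama_load_model_from_file(path, params));
}

static context_ptr create_context(llama_model * model) {
    llama_context_params params = llama_context_default_params();
    return context_ptr(llama_new_context_with_model(model, params));
}
```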
Name | Last commit message | Last commit date
batched | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
batched-bench | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
batched.swift | llama : llama_perf + option to disable timings during decode (#9355) | 2024-09-13 09:53:38 +03:00
convert-llama2c-to-ggml | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
cvector-generator | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
deprecation-warning | examples : remove finetune and train-text-from-scratch (#8669) | 2024-07-25 10:39:04 +02:00
embedding | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
eval-callback | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
export-lora | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
gbnf-validator | llama : refactor sampling v2 (#9294) | 2024-09-07 15:16:19 +03:00
gen-docs | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
gguf | gguf : handle null name during init (#8587) | 2024-07-20 17:15:42 +03:00
gguf-hash | gguf-hash : update clib.json to point to original xxhash repo (#8491) | 2024-07-16 10:14:16 +03:00
gguf-split | gguf-split : improve --split and --merge logic (#9619) | 2024-10-02 10:21:57 +03:00
gritlm | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
imatrix | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
infill | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
jeopardy | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
llama-bench | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
llama.android | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
llama.swiftui | llama : default sampling changes + greedy update (#9897) | 2024-10-21 09:46:40 +03:00
llava | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
lookahead | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
lookup | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
main | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
main-cmake-pkg | Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258) | 2024-07-02 12:18:10 -04:00
parallel | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
passkey | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
perplexity | llama/ex: remove --logdir argument (#10339) | 2024-11-16 23:00:41 +01:00
quantize | quantize : improve type name parsing (#9570) | 2024-09-20 20:55:36 +02:00
quantize-stats | ggml : build backends as libraries (#10256) | 2024-11-14 18:04:35 +01:00
retrieval | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
rpc | ggml : move CPU backend to a separate file (#10144) | 2024-11-03 19:34:08 +01:00
run | Introduce llama-run (#10291) | 2024-11-25 22:56:24 +01:00
save-load-state | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
server | server : enable cache_prompt by default (#10501) | 2024-11-25 21:50:07 +02:00
simple | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
simple-chat | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
speculative | llama : accept a list of devices to use to offload a model (#10497) | 2024-11-25 19:30:06 +01:00
speculative-simple | llama : accept a list of devices to use to offload a model (#10497) | 2024-11-25 19:30:06 +01:00
sycl | [SYCL] set context default value to avoid memory issue, update guide (#9476) | 2024-09-18 08:30:31 +08:00
tokenize | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
base-translate.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
chat-13B.bat | Create chat-13B.bat (#592) | 2023-03-29 20:21:09 +03:00
chat-13B.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
chat-persistent.sh | scripts : fix pattern and get n_tokens in one go (#10221) | 2024-11-09 09:06:54 +02:00
chat-vicuna.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
chat.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
CMakeLists.txt | Introduce llama-run (#10291) | 2024-11-25 22:56:24 +01:00
convert_legacy_llama.py | metadata: Detailed Dataset Authorship Metadata (#8875) | 2024-11-13 21:10:38 +11:00
json_schema_pydantic_example.py | py : type-check all Python scripts with Pyright (#8341) | 2024-07-07 15:04:39 -04:00
json_schema_to_grammar.py | grammar : fix JSON Schema for string regex with top-level alt. (#9903) | 2024-10-16 19:03:24 +03:00
llama.vim | llama.vim : bump generation time limit to 3s [no ci] | 2024-10-23 17:16:56 +03:00
llm.vim | llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) | 2023-08-30 09:50:55 +03:00
Miku.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
pydantic_models_to_grammar_examples.py | examples : Rewrite pydantic_models_to_grammar_examples.py (#8493) | 2024-07-20 22:09:17 -04:00
pydantic_models_to_grammar.py | pydantic : replace uses of __annotations__ with get_type_hints (#8474) | 2024-07-14 19:51:21 -04:00
reason-act.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
regex_to_grammar.py | py : switch to snake_case (#8305) | 2024-07-05 07:53:33 +03:00
server_embd.py | py : type-check all Python scripts with Pyright (#8341) | 2024-07-07 15:04:39 -04:00
server-llama2-13B.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
ts-type-to-grammar.sh | JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) | 2024-04-12 19:43:38 +01:00