llama.cpp/examples
Latest commit 0cc63754b8 by Eric Curtin, 2024-11-25 22:56:24 +01:00
Introduce llama-run (#10291)

It is like simple-chat, but it uses smart pointers to avoid manual memory
cleanups, so there are fewer memory leaks in the code. It avoids printing
multiple dots, splits the code into smaller functions, and uses no exception
handling.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
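As a rough illustration of the smart-pointer pattern the commit message describes, the sketch below wraps the llama.cpp C handles in std::unique_ptr with custom deleters so cleanup happens automatically. The wrapper names (model_ptr, context_ptr, load_model, create_context) are hypothetical, and the llama_* calls are taken from the public C API of that period; llama-run's actual code may differ.

```cpp
// Illustrative only: give std::unique_ptr custom deleters that forward to the
// llama.cpp C API's free functions, so handles are released automatically when
// they go out of scope instead of needing manual cleanup calls.
#include <memory>

#include "llama.h"

struct llama_model_deleter {
    void operator()(llama_model * model) const { llama_free_model(model); }
};
struct llama_context_deleter {
    void operator()(llama_context * ctx) const { llama_free(ctx); }
};

using model_ptr   = std::unique_ptr<llama_model,   llama_model_deleter>;
using context_ptr = std::unique_ptr<llama_context, llama_context_deleter>;

// Load a model and create a context; on failure the returned smart pointer is
// simply null, so errors can be handled without exceptions.
static model_ptr load_model(const char * path) {
    llama_model_params params = llama_model_default_params();
    return model_ptr(llama_load_model_from_file(path, params));
}

static context_ptr create_context(llama_model * model) {
    llama_context_params params = llama_context_default_params();
    return context_ptr(llama_new_context_with_model(model, params));
}
```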
Name | Last commit message | Last commit date
batched | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
batched-bench | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
batched.swift | llama : llama_perf + option to disable timings during decode (#9355) | 2024-09-13 09:53:38 +03:00
convert-llama2c-to-ggml | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
cvector-generator | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
deprecation-warning | examples : remove finetune and train-text-from-scratch (#8669) | 2024-07-25 10:39:04 +02:00
embedding | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
eval-callback | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
export-lora | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
gbnf-validator | llama : refactor sampling v2 (#9294) | 2024-09-07 15:16:19 +03:00
gen-docs | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
gguf | gguf : handle null name during init (#8587) | 2024-07-20 17:15:42 +03:00
gguf-hash | gguf-hash : update clib.json to point to original xxhash repo (#8491) | 2024-07-16 10:14:16 +03:00
gguf-split | gguf-split : improve --split and --merge logic (#9619) | 2024-10-02 10:21:57 +03:00
gritlm | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
imatrix | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
infill | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
jeopardy | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
llama-bench | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
llama.android | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 2024-10-18 23:18:01 +02:00
llama.swiftui | llama : default sampling changes + greedy update (#9897) | 2024-10-21 09:46:40 +03:00
llava | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
lookahead | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
lookup | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
main | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
main-cmake-pkg | Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258) | 2024-07-02 12:18:10 -04:00
parallel | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
passkey | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
perplexity | llama/ex: remove --logdir argument (#10339) | 2024-11-16 23:00:41 +01:00
quantize | quantize : improve type name parsing (#9570) | 2024-09-20 20:55:36 +02:00
quantize-stats | ggml : build backends as libraries (#10256) | 2024-11-14 18:04:35 +01:00
retrieval | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
rpc | ggml : move CPU backend to a separate file (#10144) | 2024-11-03 19:34:08 +01:00
run | Introduce llama-run (#10291) | 2024-11-25 22:56:24 +01:00
save-load-state | speculative : refactor and add a simpler example (#10362) | 2024-11-25 09:58:41 +02:00
server | server : enable cache_prompt by default (#10501) | 2024-11-25 21:50:07 +02:00
simple | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
simple-chat | ggml : add support for dynamic loading of backends (#10469) | 2024-11-25 15:13:39 +01:00
speculative | llama : accept a list of devices to use to offload a model (#10497) | 2024-11-25 19:30:06 +01:00
speculative-simple | llama : accept a list of devices to use to offload a model (#10497) | 2024-11-25 19:30:06 +01:00
sycl | [SYCL] set context default value to avoid memory issue, update guide (#9476) | 2024-09-18 08:30:31 +08:00
tokenize | common : use common_ prefix for common library functions (#9805) | 2024-10-10 22:57:42 +02:00
base-translate.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
chat-13B.bat | Create chat-13B.bat (#592) | 2023-03-29 20:21:09 +03:00
chat-13B.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
chat-persistent.sh | scripts : fix pattern and get n_tokens in one go (#10221) | 2024-11-09 09:06:54 +02:00
chat-vicuna.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
chat.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
CMakeLists.txt | Introduce llama-run (#10291) | 2024-11-25 22:56:24 +01:00
convert_legacy_llama.py | metadata: Detailed Dataset Authorship Metadata (#8875) | 2024-11-13 21:10:38 +11:00
json_schema_pydantic_example.py | py : type-check all Python scripts with Pyright (#8341) | 2024-07-07 15:04:39 -04:00
json_schema_to_grammar.py | grammar : fix JSON Schema for string regex with top-level alt. (#9903) | 2024-10-16 19:03:24 +03:00
llama.vim | llama.vim : bump generation time limit to 3s [no ci] | 2024-10-23 17:16:56 +03:00
llm.vim | llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) | 2023-08-30 09:50:55 +03:00
Miku.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
pydantic_models_to_grammar_examples.py | examples : Rewrite pydantic_models_to_grammar_examples.py (#8493) | 2024-07-20 22:09:17 -04:00
pydantic_models_to_grammar.py | pydantic : replace uses of __annotations__ with get_type_hints (#8474) | 2024-07-14 19:51:21 -04:00
reason-act.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
regex_to_grammar.py | py : switch to snake_case (#8305) | 2024-07-05 07:53:33 +03:00
server_embd.py | py : type-check all Python scripts with Pyright (#8341) | 2024-07-07 15:04:39 -04:00
server-llama2-13B.sh | build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) | 2024-06-13 00:41:52 +01:00
ts-type-to-grammar.sh | JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) | 2024-04-12 19:43:38 +01:00