llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-11 21:39:52 +00:00

Brian f7cab35ef9 gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048)	2024-07-07 22:58:43 +10:00

CLI to hash GGUF files to detect differences at the per-model and per-tensor level.

The supported hash types are:

- `--xxh64`: use xxhash 64-bit hash mode (default)
- `--sha1`: use sha1
- `--uuid`: use uuid
- `--sha256`: use sha256

While most POSIX systems already have hash-checking programs such as sha256sum, those are designed to check entire files. This is not ideal for our purpose if we want to check the consistency of the tensor data even when the metadata content of the gguf KV store has been updated.

This program is designed to hash a gguf tensor payload on a 'per tensor layer' basis in addition to producing an 'entire tensor model' hash. The intent is that the whole-model hash is checked first; if any inconsistency is detected, the per-tensor hashes can then be used to narrow down which specific tensor layer differs.

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
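The two-level check described in the commit message above can be sketched roughly as follows. This is a minimal illustration in Python, not the gguf-hash implementation: it does not parse GGUF, the `tensors` mapping, the tensor names, and the helper functions `hash_tensors` and `changed_tensors` are hypothetical stand-ins, and the order and exact bytes the real tool feeds into its model-wide hash are not stated here and are assumed.

```python
import hashlib


def hash_tensors(tensors):
    """Return (per_tensor, model_wide) SHA-256 hex digests.

    `tensors` is a hypothetical stand-in for a parsed GGUF file: an ordered
    mapping of tensor name -> raw tensor bytes. Metadata is deliberately not
    part of the input, so editing the KV store cannot change any digest.
    """
    model = hashlib.sha256()
    per_tensor = {}
    for name, data in tensors.items():
        per_tensor[name] = hashlib.sha256(data).hexdigest()
        model.update(data)  # model-wide hash covers tensor payloads only
    return per_tensor, model.hexdigest()


def changed_tensors(a, b):
    """Names whose per-tensor digest differs between two digest maps."""
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))


# Made-up tensor payloads; the candidate differs in one byte of one tensor.
reference = {
    "tok_embd.weight": b"\x00\x01\x02\x03",
    "output.weight": b"\x04\x05\x06\x07",
}
candidate = {**reference, "output.weight": b"\x04\x05\x06\xff"}

ref_per, ref_model = hash_tensors(reference)
cand_per, cand_model = hash_tensors(candidate)

# Check the cheap whole-model digest first, then drill down only on mismatch.
if ref_model != cand_model:
    print("mismatch in:", changed_tensors(ref_per, cand_per))
```

Comparing the single model-wide digest first keeps the common "nothing changed" case to one comparison; the per-tensor map is only consulted when that digest differs.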
baby-llama	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
batched	Inference support for T5 and FLAN-T5 model families (#5763)	2024-07-04 15:46:11 +02:00
batched-bench	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
batched.swift	Detokenizer fixes (#8039)	2024-07-05 19:01:35 +02:00
benchmark	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
convert-llama2c-to-ggml	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
cvector-generator	cvector: better prompt handling, add "mean vector" method (#8069)	2024-06-25 13:59:54 +02:00
embedding	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)	2024-07-02 12:18:10 -04:00
eval-callback	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
export-lora	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
finetune	finetune: Rename command name in README.md (#8343)	2024-07-07 13:38:02 +03:00
gbnf-validator	llama : return nullptr from llama_grammar_init (#8093)	2024-06-25 15:07:28 -04:00
gguf	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
gguf-hash	gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048)	2024-07-07 22:58:43 +10:00
gguf-split	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
gritlm	llama : allow pooled embeddings on any model (#7477)	2024-06-21 08:38:22 +03:00
imatrix	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
infill	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)	2024-07-02 12:18:10 -04:00
jeopardy	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
llama-bench	llama-bench : fix RPC indication (#7936)	2024-06-14 16:47:41 +03:00
llama.android	Delete examples/llama.android/llama/CMakeLists.txt (#8165)	2024-06-27 16:39:29 +02:00
llama.swiftui	Detokenizer fixes (#8039)	2024-07-05 19:01:35 +02:00
llava	py : use cpu-only torch in requirements.txt (#8335)	2024-07-07 14:23:38 +03:00
lookahead	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
lookup	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)	2024-07-02 12:18:10 -04:00
main	cli: add EOT when user hit Ctrl+C (#8296)	2024-07-04 20:55:03 +02:00
main-cmake-pkg	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)	2024-07-02 12:18:10 -04:00
parallel	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
passkey	passkey : add short intro to README.md [no-ci] (#8317)	2024-07-05 09:14:24 +03:00
perplexity	ppl : fix n_seq_max for perplexity (#8277)	2024-07-03 20:33:31 +03:00
quantize	Reorganize documentation pages (#8325)	2024-07-05 18:08:32 +02:00
quantize-stats	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
retrieval	llama : allow pooled embeddings on any model (#7477)	2024-06-21 08:38:22 +03:00
rpc	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
save-load-state	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
server	server: Retrieve prompt template in /props (#8337)	2024-07-07 11:10:38 +02:00
simple	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
speculative	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
sycl	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258)	2024-07-02 12:18:10 -04:00
tokenize	tokenize : add --show-count (token) option (#8299)	2024-07-04 19:38:58 +03:00
train-text-from-scratch	py : switch to snake_case (#8305)	2024-07-05 07:53:33 +03:00
base-translate.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
chat-13B.bat	Create chat-13B.bat (#592)	2023-03-29 20:21:09 +03:00
chat-13B.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
chat-persistent.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
chat-vicuna.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
chat.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
CMakeLists.txt	gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048)	2024-07-07 22:58:43 +10:00
convert_legacy_llama.py	py : switch to snake_case (#8305)	2024-07-05 07:53:33 +03:00
json_schema_pydantic_example.py	py : switch to snake_case (#8305)	2024-07-05 07:53:33 +03:00
json_schema_to_grammar.py	`json`: restore default additionalProperties to false, fix some pattern escapes (#8180)	2024-06-28 09:26:45 +01:00
llama.vim	llama.vim : added api key support (#5090)	2024-01-23 08:51:27 +02:00
llm.vim	llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879)	2023-08-30 09:50:55 +03:00
Miku.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
pydantic_models_to_grammar_examples.py	py : switch to snake_case (#8305)	2024-07-05 07:53:33 +03:00
pydantic_models_to_grammar.py	grammars: x{min,max} repetition operator (#6640)	2024-06-06 10:07:06 +01:00
reason-act.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
regex_to_grammar.py	py : switch to snake_case (#8305)	2024-07-05 07:53:33 +03:00
server_embd.py	py : switch to snake_case (#8305)	2024-07-05 07:53:33 +03:00
server-llama2-13B.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
ts-type-to-grammar.sh	JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length (#6555)	2024-04-12 19:43:38 +01:00