llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-26 03:14:35 +00:00

Gaetan Bisson 7bbb5acf12 server: avoid overwriting Authorization header (#10878) * server: avoid overwriting Authorization header If no API key is set, leave the Authorization header as is. It may be used by another part of the Web stack, such as an authenticating proxy. Fixes https://github.com/ggerganov/llama.cpp/issues/10854 * rebuild --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>		2024-12-18 15:00:07 +01:00
batched	sampling : refactor + optimize penalties sampler (#10803)	2024-12-16 12:31:14 +02:00
batched-bench	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
batched.swift	llama : llama_perf + option to disable timings during decode (#9355)	2024-09-13 09:53:38 +03:00
convert-llama2c-to-ggml	make : deprecate (#10514)	2024-12-02 21:22:53 +02:00
cvector-generator	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
deprecation-warning	Update deprecation-warning.cpp (#10619)	2024-12-04 23:19:20 +01:00
embedding	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
eval-callback	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
export-lora	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
gbnf-validator	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
gen-docs	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
gguf	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
gguf-hash	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
gguf-split	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
gritlm	server : output embeddings for all tokens when pooling = none (#10861)	2024-12-18 13:01:41 +02:00
imatrix	make : deprecate (#10514)	2024-12-02 21:22:53 +02:00
infill	readme : add option, update default value, fix formatting (#10271)	2024-12-03 12:50:08 +02:00
jeopardy	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
llama-bench	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
llama.android	llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)	2024-10-18 23:18:01 +02:00
llama.swiftui	llama : use cmake for swift build (#10525)	2024-12-08 13:14:54 +02:00
llava	llava : Allow locally downloaded models for QwenVL (#10833)	2024-12-15 21:43:25 +01:00
lookahead	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
lookup	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
main	sampling : refactor + optimize penalties sampler (#10803)	2024-12-16 12:31:14 +02:00
main-cmake-pkg	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
parallel	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
passkey	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
perplexity	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
quantize	Update README.md (#10772)	2024-12-11 16:16:32 +01:00
quantize-stats	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
retrieval	server : output embeddings for all tokens when pooling = none (#10861)	2024-12-18 13:01:41 +02:00
rpc	ggml : move CPU backend to a separate file (#10144)	2024-11-03 19:34:08 +01:00
run	Opt class for positional argument handling (#10508)	2024-12-13 19:34:25 +01:00
save-load-state	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
server	server: avoid overwriting Authorization header (#10878)	2024-12-18 15:00:07 +01:00
simple	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
simple-chat	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
speculative	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
speculative-simple	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
sycl	[SYCL] set context default value to avoid memory issue, update guide (#9476)	2024-09-18 08:30:31 +08:00
tokenize	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
chat-13B.bat	Create chat-13B.bat (#592)	2023-03-29 20:21:09 +03:00
chat-13B.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
chat-persistent.sh	scripts : fix pattern and get n_tokens in one go (#10221)	2024-11-09 09:06:54 +02:00
chat-vicuna.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
chat.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
CMakeLists.txt	common : improve -ctv -ctk CLI arguments (#10806)	2024-12-12 22:53:05 +01:00
convert_legacy_llama.py	metadata: Detailed Dataset Authorship Metadata (#8875)	2024-11-13 21:10:38 +11:00
json_schema_pydantic_example.py	py : type-check all Python scripts with Pyright (#8341)	2024-07-07 15:04:39 -04:00
json_schema_to_grammar.py	grammar : fix JSON Schema for string regex with top-level alt. (#9903)	2024-10-16 19:03:24 +03:00
llama.vim	llama.vim : bump generation time limit to 3s [no ci]	2024-10-23 17:16:56 +03:00
llm.vim	llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879)	2023-08-30 09:50:55 +03:00
Miku.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
pydantic_models_to_grammar_examples.py	examples : Rewrite pydantic_models_to_grammar_examples.py (#8493)	2024-07-20 22:09:17 -04:00
pydantic_models_to_grammar.py	pydantic : replace uses of __annotations__ with get_type_hints (#8474)	2024-07-14 19:51:21 -04:00
reason-act.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
regex_to_grammar.py	py : switch to snake_case (#8305)	2024-07-05 07:53:33 +03:00
server_embd.py	py : type-check all Python scripts with Pyright (#8341)	2024-07-07 15:04:39 -04:00
server-llama2-13B.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
ts-type-to-grammar.sh	JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length (#6555)	2024-04-12 19:43:38 +01:00