llama.cpp/examples
Gaetan Bisson 7bbb5acf12
server: avoid overwriting Authorization header (#10878)
* server: avoid overwriting Authorization header

If no API key is set, leave the Authorization header as is. It may be
used by another part of the Web stack, such as an authenticating proxy.

Fixes https://github.com/ggerganov/llama.cpp/issues/10854

* rebuild

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2024-12-18 15:00:07 +01:00
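
The behaviour described in the commit message above (leave the Authorization header untouched when no API key is set) can be sketched roughly as follows. This is an illustrative sketch only, not the code changed in #10878: the function name `build_headers`, its parameters, and the `Bearer` scheme are assumptions made for the example.

```python
from typing import Optional


def build_headers(existing: dict[str, str], api_key: Optional[str]) -> dict[str, str]:
    """Hypothetical helper: add Authorization only when an API key is set."""
    headers = dict(existing)  # copy so the caller's mapping is not mutated
    headers["Content-Type"] = "application/json"
    if api_key:
        # An API key is configured, so the client sets its own credentials.
        headers["Authorization"] = f"Bearer {api_key}"
    # With no API key, any Authorization header already present (for example,
    # one expected by an authenticating proxy) is passed through unchanged.
    return headers
```

With an empty or missing key, a header such as `Authorization: Basic ...` supplied elsewhere in the stack survives; with a key set, the client's own value takes its place.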
batched sampling : refactor + optimize penalties sampler (#10803) 2024-12-16 12:31:14 +02:00
batched-bench ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
batched.swift llama : llama_perf + option to disable timings during decode (#9355) 2024-09-13 09:53:38 +03:00
convert-llama2c-to-ggml make : deprecate (#10514) 2024-12-02 21:22:53 +02:00
cvector-generator ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
deprecation-warning Update deprecation-warning.cpp (#10619) 2024-12-04 23:19:20 +01:00
embedding ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
eval-callback ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
export-lora ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
gbnf-validator ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
gen-docs ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
gguf ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
gguf-hash ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
gguf-split remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
gritlm server : output embeddings for all tokens when pooling = none (#10861) 2024-12-18 13:01:41 +02:00
imatrix make : deprecate (#10514) 2024-12-02 21:22:53 +02:00
infill readme : add option, update default value, fix formatting (#10271) 2024-12-03 12:50:08 +02:00
jeopardy build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
llama-bench remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
llama.android llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) 2024-10-18 23:18:01 +02:00
llama.swiftui llama : use cmake for swift build (#10525) 2024-12-08 13:14:54 +02:00
llava llava : Allow locally downloaded models for QwenVL (#10833) 2024-12-15 21:43:25 +01:00
lookahead ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
lookup ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
main sampling : refactor + optimize penalties sampler (#10803) 2024-12-16 12:31:14 +02:00
main-cmake-pkg ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
parallel ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
passkey ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
perplexity ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
quantize Update README.md (#10772) 2024-12-11 16:16:32 +01:00
quantize-stats ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
retrieval server : output embeddings for all tokens when pooling = none (#10861) 2024-12-18 13:01:41 +02:00
rpc ggml : move CPU backend to a separate file (#10144) 2024-11-03 19:34:08 +01:00
run Opt class for positional argument handling (#10508) 2024-12-13 19:34:25 +01:00
save-load-state ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
server server: avoid overwriting Authorization header (#10878) 2024-12-18 15:00:07 +01:00
simple ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
simple-chat ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
speculative ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
speculative-simple ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
sycl [SYCL]set context default value to avoid memory issue, update guide (#9476) 2024-09-18 08:30:31 +08:00
tokenize remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
chat-13B.bat Create chat-13B.bat (#592) 2023-03-29 20:21:09 +03:00
chat-13B.sh build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
chat-persistent.sh scripts : fix pattern and get n_tokens in one go (#10221) 2024-11-09 09:06:54 +02:00
chat-vicuna.sh build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
chat.sh build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
CMakeLists.txt common : improve -ctv -ctk CLI arguments (#10806) 2024-12-12 22:53:05 +01:00
convert_legacy_llama.py metadata: Detailed Dataset Authorship Metadata (#8875) 2024-11-13 21:10:38 +11:00
json_schema_pydantic_example.py py : type-check all Python scripts with Pyright (#8341) 2024-07-07 15:04:39 -04:00
json_schema_to_grammar.py grammar : fix JSON Schema for string regex with top-level alt. (#9903) 2024-10-16 19:03:24 +03:00
llama.vim llama.vim : bump generation time limit to 3s [no ci] 2024-10-23 17:16:56 +03:00
llm.vim llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) 2023-08-30 09:50:55 +03:00
Miku.sh build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
pydantic_models_to_grammar_examples.py examples : Rewrite pydantic_models_to_grammar_examples.py (#8493) 2024-07-20 22:09:17 -04:00
pydantic_models_to_grammar.py pydantic : replace uses of __annotations__ with get_type_hints (#8474) 2024-07-14 19:51:21 -04:00
reason-act.sh build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
regex_to_grammar.py py : switch to snake_case (#8305) 2024-07-05 07:53:33 +03:00
server_embd.py py : type-check all Python scripts with Pyright (#8341) 2024-07-07 15:04:39 -04:00
server-llama2-13B.sh build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
ts-type-to-grammar.sh JSON schema conversion: faster repetitions, min/maxLength for strings, cap number length (#6555) 2024-04-12 19:43:38 +01:00