Default Branch

master

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 20:33:04 +00:00

Branches

7216af5c09 · ggml : fix 32-bit ARM compat (cont) · Updated 2024-01-09 08:33:16 +00:00 · root · 2590 behind, 2 ahead
d57cb9c294 · passkey : add readme · Updated 2024-01-08 09:13:44 +00:00 · root · 2600 behind, 7 ahead
7cfde78190 · llama : remove redundant GQA check · Updated 2024-01-06 14:04:20 +00:00 · root · 2608 behind, 1 ahead
9f51f3e695 · metal : opt mul_mm_id · Updated 2024-01-02 18:50:18 +00:00 · root · 2634 behind, 17 ahead
4cc78d3873 · ggml : force F32 precision for ggml_mul_mat · Updated 2024-01-02 15:54:56 +00:00 · root · 2633 behind, 1 ahead
b5af7ad84f · llama : refactor quantization to avoid <mutex> header · Updated 2024-01-02 13:56:57 +00:00 · root · 2636 behind, 1 ahead
120a1a5515 · llama : auto download HF models if URL provided · Updated 2024-01-02 11:29:06 +00:00 · root · 2637 behind, 1 ahead
f64e4f04e7 · ggml : testing GPU FP precision via quantized CPY · Updated 2023-12-30 17:11:40 +00:00 · root · 2655 behind, 1 ahead
f32f30bc57 · test · Updated 2023-12-26 15:52:42 +00:00 · root · 2685 behind, 1 ahead
ab1b75166f · Merge branch 'master' into gg/ggml_scale · Updated 2023-12-21 20:35:11 +00:00 · root · 2708 behind, 4 ahead
7c87353e61 · common : remove incorrect --model-draft default · Updated 2023-12-21 17:17:12 +00:00 · root · 2716 behind, 1 ahead
a40f6110f0 · ggml : force F32 precision for ggml_mul_mat · Updated 2023-12-19 14:34:59 +00:00 · root · 2723 behind, 1 ahead
3c734f4941 · plamo : testing · Updated 2023-12-18 15:06:05 +00:00 · root · 2728 behind, 13 ahead
a462159c43 · cuda : ggml_cuda_op_mul_mat_cublas support F32 precision · Updated 2023-12-18 12:24:29 +00:00 · root · 2728 behind, 16 ahead
1b05817112 · decode : fix logits_valid for old API · Updated 2023-12-17 23:49:21 +00:00 · root · 2729 behind, 1 ahead
865066621b · llama.swiftui : improve bench · Updated 2023-12-17 17:37:22 +00:00 · root · 2743 behind, 12 ahead
f86b9d152c · lookup : minor · Updated 2023-12-17 15:25:28 +00:00 · root · 2741 behind, 9 ahead
d2f1e0dacc · Merge branch 'cuda-cublas-opts' into gg/phi-2 · Updated 2023-12-17 06:41:46 +00:00 · root · 2746 behind, 17 ahead
b0547d2196 · gguf-py : fail fast on nonsensical special token IDs · Updated 2023-12-15 23:06:42 +00:00 · root · 2748 behind, 1 ahead
c8554b80be · Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/fix-cuda-warning-flags · Updated 2023-12-13 17:06:01 +00:00 · root · 2760 behind, 12 ahead