llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-26 03:14:35 +00:00

master

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 20:33:04 +00:00

gg/flash-attn-mask-f16 1ad42b1f1e · ggml : ggml_soft_max uses F16 mask · Updated 2024-01-31 18:33:59 +00:00 root	2355 36
ik/fix_iq3xxs_metal 719a087138 · iq3_xxs: forgotten update of the grid points · Updated 2024-01-30 16:39:07 +00:00 root	2369 1
gg/flash-attn-simd 2bf91c5306 · metal : clean up · Updated 2024-01-25 11:29:45 +00:00 root	2465 23
gg/flash-attn-wip3 6ccbd1777a · wip · Updated 2024-01-24 13:45:04 +00:00 root	2465 18
gg/flash-attn-wip4 da23b56f25 · wip : no ic 8 step · Updated 2024-01-24 11:25:34 +00:00 root	2465 18
gg/flash-attn-wip2 06c2d0d117 · wip · Updated 2024-01-23 20:42:43 +00:00 root	2465 14
gg/flash-attn-online a9681febd6 · ggml : online attention (CPU) · Updated 2024-01-20 14:45:41 +00:00 root	2465 4
ceb/fix-msvc-build 32a392fe68 · try a differerent fix · Updated 2024-01-19 22:10:23 +00:00 root	2466 2
ceb/restore-convert 4a3bc1522e · py : linting with mypy and isort · Updated 2024-01-19 20:18:58 +00:00 root	2467 3
ceb/nomic-vulkan-fix-add 1453215165 · kompute : fix ggml_add kernel · Updated 2024-01-18 22:09:16 +00:00 root	2583 105
ik/faster_hellaswag ccc78a200e · hellaswag: speed up even more by parallelizing log-prob evaluation · Updated 2024-01-18 16:25:29 +00:00 root	2483 1
gg/imatrix-gpu-4931 2917e6b528 · Merge branch 'master' into gg/imatrix-gpu-4931 · Updated 2024-01-17 16:43:45 +00:00 root	2490 10
gg/fix-spm-added-tokens-dict-4958 23742deb5b · py : fix padded dummy tokens (I hope) · Updated 2024-01-17 13:44:22 +00:00 root	2509 4
ik/better_q2_k_s 9fd1e83f6d · Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 · Updated 2024-01-17 10:16:08 +00:00 root	2495 1
gg/iq2-refactor-and-tests 49bafe0986 · tests : avoid creating RNGs for each tensor · Updated 2024-01-17 08:40:55 +00:00 root	2498 6
ik/imatrix_legacy_quants bb9abb5cd8 · imatrix: guard Q4_0/Q5_0 against ffn_down craziness · Updated 2024-01-16 07:56:05 +00:00 root	2512 2
gg/add-phixtral 9998ecd191 · llama : add phixtral support (wip) · Updated 2024-01-13 12:24:07 +00:00 root	2542 1
gg/update-phi2-convert 1fb563ebdc · py : try to fix flake stuff · Updated 2024-01-13 11:42:35 +00:00 root	2536 2
ik/iq2_2.31bpw 9bfcb16fd3 · Add llama enum for IQ2_XS · Updated 2024-01-11 16:24:12 +00:00 root	2585 11
gg/server-infill-empty-prompt-4027 24096933b0 · server : try to fix infill when prompt is empty · Updated 2024-01-09 09:27:29 +00:00 root	2587 1
