llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-25 02:44:36 +00:00

master

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 20:33:04 +00:00

test-mmv 29fe516913 · wip · Updated 2023-10-31 16:36:37 +00:00 root	2890 1
deploy dab42893c9 · scripts : working curl pipe · Updated 2023-10-31 15:03:56 +00:00 root	2890 3
llama-refactor-norm 7923b70cb8 · llama : add llm_build_inp_embd helper · Updated 2023-10-31 14:43:08 +00:00 root	2895 37
ggml-impl 4b3cb98d46 · ggml-impl : move extern "C" to start of file · Updated 2023-10-30 17:05:58 +00:00 root	2891 7
lto bc28aaa8c2 · make : use -lfto=auto to avoid warnings and maintain perf · Updated 2023-10-30 14:00:53 +00:00 root	2891 5
scratch 15267192c0 · llama : refactor tensor offloading as callback · Updated 2023-10-29 11:04:36 +00:00 root	2895 15
ggml-quants 8a86b95e87 · quantize : --pure option for disabling k-quant mixtures · Updated 2023-10-28 20:37:03 +00:00 root	2896 3
apply-3585 de7e0912b6 · convert : ignore tokens if their IDs are within [0, vocab_size) · Updated 2023-10-28 12:01:36 +00:00 root	2899 1
sampling-greedy-with-probs bbfc62ac2f · sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs · Updated 2023-10-28 11:04:57 +00:00 root	2907 3
cuda-multi-gpu cd3e20fb50 · cuda : fix multi-gpu with tensor cores · Updated 2023-10-27 20:11:50 +00:00 root	2906 3
cuda-quantum-batch 49af767fad · build : add compile option to force use of MMQ kernels · Updated 2023-10-27 10:21:04 +00:00 root	2908 7
cuda-batched-gemm d798a17c34 · cuda : add TODO for calling cublas from kernel + using mem pool · Updated 2023-10-24 13:33:24 +00:00 root	2922 10
cuda-batched-gemm-deq 6966474928 · cuda : play with faster Q4_0 dequantization · Updated 2023-10-24 07:29:40 +00:00 root	2922 8
upd-issue-templates b9bb4cbe86 · Separate bug and enhancement template + no default title · Updated 2023-10-23 15:59:11 +00:00 root	2922 1
server-rev c0f4d54870 · server : add comment about changing slot_state to bool · Updated 2023-10-22 19:24:39 +00:00 root	2928 72
perf-study cb79f8a2d8 · llama : add SKIP_KQ_KQV option · Updated 2023-10-22 06:58:29 +00:00 root	2928 3
sampling-refactor 56ba00b923 · sampling : hide prev behind API and apply #3661 · Updated 2023-10-20 15:53:27 +00:00 root	2931 6
speculative-tree ad2727d091 · Merge branch 'master' into speculative-tree · Updated 2023-10-18 07:50:58 +00:00 root	2942 18
llava-fix-offloading 932589c0ef · Honor -ngl option for Cuda offloading in llava · Updated 2023-10-14 00:12:10 +00:00 root	2956 1
rev-sampling 5261aee8d8 · sampling : one sequence per sampling context · Updated 2023-10-12 17:36:44 +00:00 root	2959 1
