Default Branch

master

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 20:33:04 +00:00

Branches

2fcdf869cd · batched-bench : add mmq CLI arg · Updated 2023-10-11 16:42:33 +00:00    root

2971
7

ee7456926e · ggml-alloc : fix assert in debug builds · Updated 2023-10-09 12:33:12 +00:00    root

2980
1

ee268b5446 · llama : no longer perform uninitialized access to the KV cache · Updated 2023-10-08 08:49:38 +00:00    root

2987
5

acead654d2 · Merge branch 'master' into fix-refact · Updated 2023-10-08 08:25:16 +00:00    root

2987
4

6b9554a740 · metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7 · Updated 2023-10-08 06:55:13 +00:00    root

2994
5

ba44776dc2 · bump version · Updated 2023-10-07 18:47:48 +00:00    root

2993
6

5ab6c2132a · server-parallel : add "--reverse-prompt" + compiler warning fixes · Updated 2023-10-06 11:32:19 +00:00    root

3006
4

5418932b71 · llama : fix comments for llama_kv_cache API · Updated 2023-10-03 18:01:52 +00:00    root

3031
5

c5650ed470 · server : avoid context swaps by shifting the KV cache · Updated 2023-09-28 16:03:36 +00:00    root

3055
57

72e7ef4e53 · simple : fixes · Updated 2023-09-26 21:19:36 +00:00    root

3081
48

784d14ed31 · llama : store non-RoPEd K cache (WIP) · Updated 2023-09-17 20:43:07 +00:00    root

3093
5

92a4f86879 · llama : make starcoder graph build more consistent with others · Updated 2023-09-15 14:57:10 +00:00    root

3103
20

e7e7b11455 · llama : remove experimental stuff · Updated 2023-09-14 19:52:01 +00:00    root

3115
3

2f689dee06 · metal : minor · Updated 2023-09-07 12:33:21 +00:00    root

3148
5

30ac7a4117 · gitignore : metal · Updated 2023-09-04 19:23:16 +00:00    root

3160
12

f3a84b2e0d · llama : better express the KV cache dependencies in the graph · Updated 2023-09-04 18:44:48 +00:00    root

3160
5

c79d130f74 · make : fix speculative build · Updated 2023-09-04 12:50:04 +00:00    root

3161
9

847896aba7 · speculative : add --draft CLI arg · Updated 2023-09-03 10:51:07 +00:00    root

3167
3

8c2b881281 · cuda : poc for norm quants (only -b 1 works) · Updated 2023-08-30 18:42:28 +00:00    root

3208
3

b4e70822f6 · metal : add poc for normalized Q4_0 and Q4_1 · Updated 2023-08-30 15:47:16 +00:00    root

3208
7