Default Branch

master

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 20:33:04 +00:00

Branches

072c56fcdb · metal : fix the fix · Updated 2024-03-22 07:58:22 +00:00    root

1898
3

a710d58d88 · Try fix quantized k-cache on ROCm · Updated 2024-03-21 18:18:50 +00:00    root

1903
1

68e4fed4d9 · Now fix test-quantize-fns · Updated 2024-03-21 11:18:03 +00:00    root

1910
3

9a424a3872 · server : fix tests expecting old repeat penalty · Updated 2024-03-19 21:12:28 +00:00    root

1926
1

0a9bc301ac · control-vectors : minor code style updates · Updated 2024-03-14 14:43:37 +00:00    root

1965
3

abf0afd0d6 · ci : fix iOS builds to use embedded library · Updated 2024-03-14 09:34:22 +00:00    root

1983
4

9f805264dc · Attempt 2 · Updated 2024-03-12 16:40:13 +00:00    root

1983
3

5440a127c7 · iq1_s: fix dequantize on the CPU · Updated 2024-03-11 13:17:28 +00:00    root

1996
6

76be02aebc · sycl : fix grid type · Updated 2024-03-11 13:17:08 +00:00    root

1991
3

989e15b3c1 · Merge branch 'master' into sycl_q3s_q1s · Updated 2024-03-11 03:11:35 +00:00    root

1998
9

b54afce9f4 · mostly style fixes; fix KQ_mask comment · Updated 2024-03-09 19:03:46 +00:00    root

2041
10

0ba20ed97a · llama : compute BERT graph with F16 K, V · Updated 2024-03-07 14:33:30 +00:00    root

2031
1

b5b0270372 · Revert "[SYCL] fix error when set main gpu to non-zero (#5901)" · Updated 2024-03-07 09:11:18 +00:00    root

2035
1

31cecc8734 · iq3_s_mult_shuffle: use lookup table on Metal · Updated 2024-03-05 08:19:44 +00:00    root

2109
24

4ec0e9abbf · wip · Updated 2024-03-04 15:07:12 +00:00    root

2058
5

eb0bf32caf · server: tests: schedule slow dispatch only on release or on demand · Updated 2024-03-02 22:18:31 +00:00    root

2070
1

0b673ca187 · s/_MODEL_CLASSES/_model_classes/ · Updated 2024-03-02 17:14:37 +00:00    root

2083
3

d4dfc250cc · Fix ARM_NEON · Updated 2024-03-02 08:12:02 +00:00    root

2088
7

f8ab539190 · convert : update help string · Updated 2024-03-01 17:29:34 +00:00    root

2086
3

9862d59c05 · llama : change starcoder2 rope type · Updated 2024-03-01 13:10:31 +00:00    root

2095
8