Default Branch

master
Some checks failed
Python Type-Check / pyright type-check (push) Has been cancelled
flake8 Lint / Lint (push) Has been cancelled

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 20:33:04 +00:00

Branches

124e4dced2 · Update · Updated 2024-04-22 09:42:32 +00:00 · root · 1725 behind, 2 ahead

3750706962 · llama : add llama_token_is_eog() · Updated 2024-04-20 13:52:03 +00:00 · root · 1690 behind, 4 ahead

f02ea667c1 · ggml : temporary disable llamafile sgemm until fixed · Updated 2024-04-16 19:45:56 +00:00 · root · 1699 behind, 1 ahead

eedd42e376 · KV Cache defrag hash overflow - TMP Fix by @slaren · Updated 2024-04-16 08:24:34 +00:00 · root · 1702 behind, 1 ahead

8b495540fa · imatrix : remove invalid assert · Updated 2024-04-12 08:45:12 +00:00 · root · 1727 behind, 1 ahead

072e0a4d3b · scipts : add LICENSE and gen-authors.sh to sync · Updated 2024-04-09 06:19:33 +00:00 · root · 1803 behind, 3 ahead

a37696d4f1 · speculative : more robust tokenizer comparison · Updated 2024-04-04 22:28:13 +00:00 · root · 1773 behind, 9 ahead

4c190ba676 · cuda : reduce registers · Updated 2024-03-28 19:17:08 +00:00 · root · 1815 behind, 77 ahead

64b7d85891 · llama : fix command-r inference · Updated 2024-03-28 10:22:24 +00:00 · root · 1820 behind, 1 ahead

6be02b5969 · cuda : fix build · Updated 2024-03-27 08:31:52 +00:00 · root · 1837 behind, 72 ahead

87a6088ffe · rename unicodedata.{cpp,h} to unicode-data.{cpp,h} · Updated 2024-03-26 14:52:33 +00:00 · root · 1852 behind, 7 ahead

9c5fd6be14 · minor : spacing · Updated 2024-03-26 12:09:02 +00:00 · root · 1850 behind, 2 ahead

6f20e2672f · Include IQ2_XXS and IQ2_XS in teet-quantize-fns · Updated 2024-03-25 17:01:20 +00:00 · root · 1854 behind, 1 ahead

210e469114 · cuda : fix LLAMA_CUDA_F16 build · Updated 2024-03-25 14:31:10 +00:00 · root · 1856 behind, 1 ahead

d05c13b3b9 · llama : fix BPE LF token on MSVC · Updated 2024-03-23 18:03:16 +00:00 · root · 1876 behind, 3 ahead

3a468e6f9f · llama : fix type of KQ_mask and KQ_pos · Updated 2024-03-22 15:12:17 +00:00 · root · 1881 behind, 68 ahead

0e826d12a5 · quantize: be able to specify the token embedding tensor type · Updated 2024-03-22 14:27:34 +00:00 · root · 1890 behind, 2 ahead

8c3d5b5a79 · common : remove defaults · Updated 2024-03-22 13:33:24 +00:00 · root · 1893 behind, 2 ahead

12aa74ba7d · minor : spacing · Updated 2024-03-22 13:24:57 +00:00 · root · 2118 behind, 6 ahead

31f2d03f1b · server : enable continuous batching by default · Updated 2024-03-22 09:16:43 +00:00 · root · 1897 behind, 1 ahead