Default Branch

master

c05e8c9934 · gguf-py: fixed local detection of gguf package (#11180) · Updated 2025-01-11 09:42:31 +00:00

Branches

f55b647300 · llama : minor indentation during tensor loading · Updated 2024-07-04 16:34:04 +00:00    root

1159
16

dcab343f2f · use 1 seq for kl_divergence · Updated 2024-07-03 14:22:58 +00:00    root

1174
2

703764a382 · convert : use non-fast T5 tokenizer · Updated 2024-07-02 16:29:26 +00:00    root

1216
10

d4a1923d4e · minor : remove parentheses · Updated 2024-07-01 11:45:55 +00:00    root

1194
2

51f0bd50a1 · Remove custom pre attention scaling and use computed value instead. · Updated 2024-06-30 03:02:50 +00:00    root

1197
10

712e4d9450 · Generate full token count during warm up · Updated 2024-06-28 12:29:00 +00:00    root

1200
1

65f9293d14 · devops : remove clblast + LLAMA_CUDA -> GGML_CUDA · Updated 2024-06-26 16:17:26 +00:00    root

1226
1

1e6e363d7f · test zero max buffer size · Updated 2024-06-26 15:11:09 +00:00    root

1227
1

ff0aa3abd1 · fix part of mul_mat_id · Updated 2024-06-21 03:38:00 +00:00    root

1270
1

f3974cabac · all matrix multiplication backend · Updated 2024-06-18 11:18:26 +00:00    root

1311
1

ce6e28cc23 · Update ggml-sycl.cpp · Updated 2024-06-18 08:57:14 +00:00    root

1322
6

ef79941ac9 · llama : disable FA if KV head size do not match · Updated 2024-06-17 16:20:24 +00:00    root

1292
1

a235b7c532 · Vectorize q load · Updated 2024-06-17 09:30:40 +00:00    root

1322
11

98f948b9d0 · unicode : avoid char32_t · Updated 2024-06-16 10:18:46 +00:00    root

1307
1

28f7a4d028 · ggml : fix handling of zero blocks in IQ quants · Updated 2024-06-16 07:41:53 +00:00    root

1308
1

e9f2abfc8c · bitnet : pad tensors to 256 · Updated 2024-06-15 16:01:03 +00:00    root

1326
25

34bdbed481 · rpc : fix load/store misaligned addresses · Updated 2024-06-15 11:39:20 +00:00    root

1310
1

eaf34ba0cd · metal : utilize max shared memory for mul_mat_id · Updated 2024-06-14 10:02:25 +00:00    root

1317
1

18133cab40 · Revert "use the correct SYCL context for host USM allocations" · Updated 2024-06-13 11:08:27 +00:00    root

1322
4

46325233c9 · Revert 7777 · Updated 2024-06-12 15:22:55 +00:00    root

1322
1