Default Branch

30caac3a68 · llama : the WPM vocabs use the CLS token as BOS (#10930) · Updated 2024-12-24 07:44:20 +00:00

Branches

57349e1db3 · llama : allow overrides for tokenizer flags · Updated 2024-07-21 11:42:15 +00:00 · root · 957 behind, 1 ahead
1932a1b871 · gguf-py : do not use title case for naming convention · Updated 2024-07-20 20:55:06 +00:00 · root · 965 behind, 5 ahead
c8ee1bccdd · Fix Vulkan matmul tests compile errors · Updated 2024-07-20 06:01:18 +00:00 · root · 965 behind, 1 ahead
50d1a035f0 · convert_hf : fix Gemma v1 not setting BOS and EOS tokens · Updated 2024-07-20 02:46:35 +00:00 · root · 965 behind, 2 ahead
38061254b9 · gguf : handle null name during init · Updated 2024-07-19 10:45:00 +00:00 · root · 970 behind, 1 ahead
f6ea7a093c · llama : change fallback type IQ4_NL -> Q4_0 · Updated 2024-07-16 07:00:57 +00:00 · root · 986 behind, 1 ahead
b971122eb1 · convert_hf : fix memory leak in lazy MoE conversion · Updated 2024-07-16 01:11:44 +00:00 · root · 988 behind, 3 ahead
f89eaa921e · pydantic : fix Python 3.9 and 3.10 support · Updated 2024-07-14 01:52:45 +00:00 · root · 1002 behind, 2 ahead
59ce85318a · test-tokenizer-random : reduce potential confilcts with #8379 · Updated 2024-07-13 05:56:05 +00:00 · root · 1020 behind, 14 ahead
ba06b2deb7 · tokenize : add --no-parse-special option · Updated 2024-07-10 22:06:25 +00:00 · root · 1020 behind, 1 ahead
117f7adbd9 · ggml : remove K_QUANTS_PER_ITERATION (#8306) · Updated 2024-07-10 12:23:12 +00:00 · root · 1083 behind, 7 ahead
aaf7bc89e4 · Merge branch 'master' into compilade/gguf-py-fix-old-numpy · Updated 2024-07-09 04:10:06 +00:00 · root · 1038 behind, 2 ahead
86ccd30983 · ci : only show warnings and errors in python type-check · Updated 2024-07-07 18:10:42 +00:00 · root · 1053 behind, 10 ahead
a44f22e7d3 · py : use cpu-only torch in requirements.txt · Updated 2024-07-06 15:18:03 +00:00 · root · 1063 behind, 1 ahead
f55b647300 · llama : minor indentation during tensor loading · Updated 2024-07-04 16:34:04 +00:00 · root · 1085 behind, 16 ahead
dcab343f2f · use 1 seq for kl_divergence · Updated 2024-07-03 14:22:58 +00:00 · root · 1100 behind, 2 ahead
703764a382 · convert : use non-fast T5 tokenizer · Updated 2024-07-02 16:29:26 +00:00 · root · 1142 behind, 10 ahead
d4a1923d4e · minor : remove parentheses · Updated 2024-07-01 11:45:55 +00:00 · root · 1120 behind, 2 ahead
51f0bd50a1 · Remove custom pre attention scaling and use computed value instead. · Updated 2024-06-30 03:02:50 +00:00 · root · 1123 behind, 10 ahead
712e4d9450 · Generate full token count during warm up · Updated 2024-06-28 12:29:00 +00:00 · root · 1126 behind, 1 ahead