Commit Graph

4 Commits

Author SHA1 Message Date
Francis Couture-Harpin
7ef4254a92 ggml-quants : faster 1.625 bpw AVX2 vec_dot
Not using a lookup table anymore makes it match q4_0 speed.

* gguf-py : fix formatting

* llama : remove spaces on empty line
2024-06-27 02:06:28 -04:00
Francis Couture-Harpin
bd807499f7 ggml-quants : 1.625 bpw ternary packing for BitNet 1.58b
2024-06-27 02:06:22 -04:00
compilade
b83bab15a5
gguf-py : fix and simplify quantized shape round-trip (#7483)
* gguf-py : fix and simplify quantized shape round-trip

* gguf-py : remove unused import
2024-05-25 11:11:48 +10:00
compilade
ee52225067
convert-hf : support direct Q8_0 conversion (#7234)
* convert-hf : support q8_0 conversion

* convert-hf : add missing ftype

This was messing with the checksums otherwise.

* convert-hf : add missing ftype to Baichuan and Xverse

I didn't notice these on my first pass.
2024-05-13 14:10:51 -04:00