mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-11 19:21:46 +00:00


Georgi Gerganov 574406dc7e ggml : add Q5_0 and Q5_1 quantization (#1187)	2023-04-26 23:14:13 +03:00
* ggml : add Q5_0 quantization (cuBLAS only)
* ggml : fix Q5_0 qh -> uint32_t
* ggml : fix q5_0 histogram stats
* ggml : q5_0 scalar dot product
* ggml : q5_0 ARM NEON dot
* ggml : q5_0 more efficient ARM NEON using uint64_t masks
* ggml : rename Q5_0 -> Q5_1
* ggml : adding Q5_0 mode
* quantize : add Q5_0 and Q5_1 to map
* ggml : AVX2 optimizations for Q5_0, Q5_1 (#1195)
---------
Co-authored-by: Stephan Walter <stephan@walter.name>
CMakeLists.txt	llama : fix linkage with mingw (#551)	2023-03-28 21:23:09 +03:00
quantize.cpp	ggml : add Q5_0 and Q5_1 quantization (#1187)	2023-04-26 23:14:13 +03:00
README.md	Overhaul the examples structure	2023-03-25 20:26:40 +02:00

quantize

TODO