Mirror of https://github.com/ggerganov/llama.cpp.git
Synced 2024-11-11 21:39:52 +00:00
Commit 147b17ac94

* imatrix: load
* imatrix: WIP
* imatrix: Add Q2_K quantization
* imatrix: also guard against Q2_K_S quantization without importance matrix
* imatrix: guard even more against low-bit quantization misuse

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Files in this directory:

* benchmark-matmult.cpp
* CMakeLists.txt
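
The commit message above describes guarding low-bit quantization (Q2_K, Q2_K_S) against being run without an importance matrix. The sketch below illustrates that kind of guard in isolation; it is not the actual llama.cpp code, and the names `quant_type`, `requires_imatrix`, and `check_quantization_request` are hypothetical placeholders chosen for this example.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical quantization types relevant to the guard described above.
enum class quant_type { Q2_K, Q2_K_S, Q4_0, Q8_0 };

// True when the requested type is a low-bit quantization that benefits
// strongly from an importance matrix (per-weight activation statistics).
static bool requires_imatrix(quant_type t) {
    return t == quant_type::Q2_K || t == quant_type::Q2_K_S;
}

// Guard: refuse to quantize to a low-bit type unless an importance
// matrix has been loaded.
static bool check_quantization_request(quant_type t, const std::vector<float> &imatrix) {
    if (requires_imatrix(t) && imatrix.empty()) {
        std::fprintf(stderr,
            "error: this quantization type requires an importance matrix; "
            "generate one with the imatrix tool and pass it to quantize\n");
        return false;
    }
    return true;
}

int main() {
    std::vector<float> imatrix; // empty: no importance matrix was loaded
    if (!check_quantization_request(quant_type::Q2_K_S, imatrix)) {
        return 1; // quantization refused, which is what the guard intends
    }
    return 0;
}
```

The design choice is simply to fail early with an actionable error rather than silently produce a badly degraded low-bit model.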