llama.cpp/ggml
Eve
Q6_K AVX improvements (#10118)
* q6_k instruction reordering attempt

* better subtract method

* should be theoretically faster

  small improvement with the shuffle LUT, likely because all the loads are already done at that stage (see the first sketch below the commit notes)

* optimize bit fiddling

* handle the -32 offset separately; bsums exists for a reason! (a scalar sketch of this follows below the commit notes)

* use shift

* Update ggml-quants.c

* have to update the CI macOS version to 13, as 12 doesn't work now; 13 is still x86
2024-11-04 23:06:31 +01:00
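On the shuffle-LUT note above: `_mm256_shuffle_epi8` (pshufb) can act as a 16-entry per-byte lookup table, which lets the 2-bit high fields of a Q6_K quant be moved into bit positions 4-5 without a shift+mask+shift per field. The sketch below only illustrates that technique under stated assumptions; it is not the kernel from this PR, and the helper name `expand_high_bits` and the exact fields it extracts are made up for illustration.

```c
#include <immintrin.h>

// Illustrative sketch (not the actual PR kernel): _mm256_shuffle_epi8 as a
// 16-entry per-byte LUT. In Q6_K, each qh byte packs four 2-bit high fields;
// the two fields sitting in bits 0-3 of every byte can be moved to bits 4-5
// (where they belong in the reconstructed 6-bit quant) with one mask and two
// shuffles instead of a shift/mask sequence per field.
static inline void expand_high_bits(__m256i qh, __m256i * hi0, __m256i * hi1) {
    // pshufb indexes with the low 4 bits of each byte, so both tables are
    // indexed by (qh & 0x0F):
    //   lut0[i] = (i & 0x3) << 4   -- moves bits 0-1 into bits 4-5
    //   lut1[i] = (i & 0xC) << 2   -- moves bits 2-3 into bits 4-5
    const __m256i lut0 = _mm256_setr_epi8(
         0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48,
         0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48);
    const __m256i lut1 = _mm256_setr_epi8(
         0,  0,  0,  0, 16, 16, 16, 16, 32, 32, 32, 32, 48, 48, 48, 48,
         0,  0,  0,  0, 16, 16, 16, 16, 32, 32, 32, 32, 48, 48, 48, 48);

    const __m256i idx = _mm256_and_si256(qh, _mm256_set1_epi8(0x0F));
    *hi0 = _mm256_shuffle_epi8(lut0, idx); // field 0, already shifted into place
    *hi1 = _mm256_shuffle_epi8(lut1, idx); // field 1, already shifted into place
}
```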
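And on the -32-offset note: Q6_K stores its quants with an implicit -32 offset, while the Q8_K activation blocks carry precomputed per-group sums (bsums). The offset can therefore be pulled out of the inner multiply-accumulate, since scale * sum_i (q6_i - 32) * q8_i = scale * sum_i q6_i * q8_i - 32 * scale * sum_i q8_i. The scalar sketch below only shows that algebra; the function and argument names are illustrative, not the actual ggml code.

```c
#include <stdint.h>

// Scalar illustration of factoring the constant -32 offset out of the
// Q6_K x Q8_K dot product (names are illustrative, not the ggml fields):
//
//   sum_i scale * (q6_i - 32) * q8_i
//     = scale * sum_i q6_i * q8_i  -  32 * scale * sum_i q8_i
//     = scale * isum               -  32 * scale * bsum
//
// so the per-element subtraction disappears from the hot loop and the offset
// is applied once per group via the precomputed q8 group sum (what bsums holds).
static int32_t q6_group_dot(const uint8_t * q6, const int8_t * q8,
                            int16_t bsum, int8_t scale) {
    int32_t isum = 0;
    for (int i = 0; i < 16; ++i) {
        isum += (int32_t) q6[i] * q8[i];   // no "- 32" per element
    }
    return scale * isum - 32 * scale * (int32_t) bsum;
}
```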
Name            Last commit                                               Date
cmake           llama : reorganize source code + improve CMake (#8006)   2024-06-26 18:33:02 +03:00
include         ggml : move CPU backend to a separate file (#10144)      2024-11-03 19:34:08 +01:00
src             Q6_K AVX improvements (#10118)                            2024-11-04 23:06:31 +01:00
.gitignore      vulkan : cmake integration (#8119)                        2024-07-13 18:12:39 +02:00
CMakeLists.txt  add amx kernel for gemm (#8998)                           2024-10-18 13:34:36 +08:00