llama.cpp/ggml
Eve
Q6_K AVX improvements (#10118)
* q6_k instruction reordering attempt

* better subtract method

* should be theoretically faster

  small improvement with the shuffle LUT, likely because all the loads are already done at that stage (see the first sketch below the commit notes)

* optimize bit fiddling

* handle the -32 offset separately; bsums exists for a reason! (a scalar sketch of this follows below the commit notes)

* use shift

* Update ggml-quants.c

* have to update the CI macOS version to 13, as 12 doesn't work now; 13 is still x86
2024-11-04 23:06:31 +01:00
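On the shuffle-LUT note above: `_mm256_shuffle_epi8` (pshufb) can act as a 16-entry per-byte lookup table, which lets the 2-bit high fields of a Q6_K quant be moved into bit positions 4-5 without a shift+mask+shift per field. The sketch below only illustrates that technique under stated assumptions; it is not the kernel from this PR, and the helper name `expand_high_bits` and the exact fields it extracts are made up for illustration.

```c
#include <immintrin.h>

// Illustrative sketch (not the actual PR kernel): _mm256_shuffle_epi8 as a
// 16-entry per-byte LUT. In Q6_K, each qh byte packs four 2-bit high fields;
// the two fields sitting in bits 0-3 of every byte can be moved to bits 4-5
// (where they belong in the reconstructed 6-bit quant) with one mask and two
// shuffles instead of a shift/mask sequence per field.
static inline void expand_high_bits(__m256i qh, __m256i * hi0, __m256i * hi1) {
    // pshufb indexes with the low 4 bits of each byte, so both tables are
    // indexed by (qh & 0x0F):
    //   lut0[i] = (i & 0x3) << 4   -- moves bits 0-1 into bits 4-5
    //   lut1[i] = (i & 0xC) << 2   -- moves bits 2-3 into bits 4-5
    const __m256i lut0 = _mm256_setr_epi8(
         0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48,
         0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48,  0, 16, 32, 48);
    const __m256i lut1 = _mm256_setr_epi8(
         0,  0,  0,  0, 16, 16, 16, 16, 32, 32, 32, 32, 48, 48, 48, 48,
         0,  0,  0,  0, 16, 16, 16, 16, 32, 32, 32, 32, 48, 48, 48, 48);

    const __m256i idx = _mm256_and_si256(qh, _mm256_set1_epi8(0x0F));
    *hi0 = _mm256_shuffle_epi8(lut0, idx); // field 0, already shifted into place
    *hi1 = _mm256_shuffle_epi8(lut1, idx); // field 1, already shifted into place
}
```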
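And on the -32-offset note: Q6_K stores its quants with an implicit -32 offset, while the Q8_K activation blocks carry precomputed per-group sums (bsums). The offset can therefore be pulled out of the inner multiply-accumulate, since scale * sum_i (q6_i - 32) * q8_i = scale * sum_i q6_i * q8_i - 32 * scale * sum_i q8_i. The scalar sketch below only shows that algebra; the function and argument names are illustrative, not the actual ggml code.

```c
#include <stdint.h>

// Scalar illustration of factoring the constant -32 offset out of the
// Q6_K x Q8_K dot product (names are illustrative, not the ggml fields):
//
//   sum_i scale * (q6_i - 32) * q8_i
//     = scale * sum_i q6_i * q8_i  -  32 * scale * sum_i q8_i
//     = scale * isum               -  32 * scale * bsum
//
// so the per-element subtraction disappears from the hot loop and the offset
// is applied once per group via the precomputed q8 group sum (what bsums holds).
static int32_t q6_group_dot(const uint8_t * q6, const int8_t * q8,
                            int16_t bsum, int8_t scale) {
    int32_t isum = 0;
    for (int i = 0; i < 16; ++i) {
        isum += (int32_t) q6[i] * q8[i];   // no "- 32" per element
    }
    return scale * isum - 32 * scale * (int32_t) bsum;
}
```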
Name            Last commit                                               Date
cmake           llama : reorganize source code + improve CMake (#8006)   2024-06-26 18:33:02 +03:00
include         ggml : move CPU backend to a separate file (#10144)      2024-11-03 19:34:08 +01:00
src             Q6_K AVX improvements (#10118)                            2024-11-04 23:06:31 +01:00
.gitignore      vulkan : cmake integration (#8119)                        2024-07-13 18:12:39 +02:00
CMakeLists.txt  add amx kernel for gemm (#8998)                           2024-10-18 13:34:36 +08:00