llama.cpp/ggml
Jeff Bolz cc98896db8
vulkan: optimize and reenable split_k (#10637)
Use vector loads when possible in mul_mat_split_k_reduce. Use split_k
when there aren't enough workgroups to fill the shaders.
2024-12-03 20:29:54 +01:00
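The split_k idea named in the commit message can be illustrated with a small CPU sketch. This is not the Vulkan shader code from the commit. The function name `matmul_split_k`, the row-major layout, and the `partial` buffer are assumptions made for illustration. The K dimension is cut into slices, each slice produces a partial result, and a separate reduce pass sums the partials (the role `mul_mat_split_k_reduce` plays in the commit message).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU sketch of split-k matrix multiplication (not the ggml code).
// A is M x K and B is K x N, both row-major. The K dimension is cut into
// `split_k` slices. Each slice writes its own partial M x N result; on a GPU
// each slice would be handled by its own group of workgroups.
std::vector<float> matmul_split_k(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t M, std::size_t N, std::size_t K,
                                  std::size_t split_k) {
    if (split_k == 0) split_k = 1;
    // Slice length, rounded up so that the slices cover all of K.
    const std::size_t k_per_slice = (K + split_k - 1) / split_k;
    std::vector<float> partial(split_k * M * N, 0.0f);

    // Pass 1: every slice computes a partial product over its own K range.
    for (std::size_t s = 0; s < split_k; ++s) {
        const std::size_t k_begin = std::min(s * k_per_slice, K);
        const std::size_t k_end   = std::min(k_begin + k_per_slice, K);
        float* out = partial.data() + s * M * N;
        for (std::size_t m = 0; m < M; ++m)
            for (std::size_t n = 0; n < N; ++n)
                for (std::size_t k = k_begin; k < k_end; ++k)
                    out[m * N + n] += A[m * K + k] * B[k * N + n];
    }

    // Pass 2: reduce, summing the partial results element by element.
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t s = 0; s < split_k; ++s)
        for (std::size_t i = 0; i < M * N; ++i)
            C[i] += partial[s * M * N + i];
    return C;
}
```

When M x N is small, tiling the output alone yields few independent work items. Splitting K creates more parallel work at the cost of the extra reduce pass, which is why it pays off only when the device would otherwise be underused.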
include         ggml : move AMX to the CPU backend (#10570)               2024-11-29 21:54:58 +01:00
src             vulkan: optimize and reenable split_k (#10637)            2024-12-03 20:29:54 +01:00
.gitignore      vulkan : cmake integration (#8119)                        2024-07-13 18:12:39 +02:00
CMakeLists.txt  ggml : automatic selection of best CPU backend (#10606)   2024-12-01 16:12:41 +01:00