llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-26 11:24:35 +00:00

Jeff Bolz cc98896db8 vulkan: optimize and reenable split_k (#10637) Use vector loads when possible in mul_mat_split_k_reduce. Use split_k when there aren't enough workgroups to fill the shaders.	2024-12-03 20:29:54 +01:00
include	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
src	vulkan: optimize and reenable split_k (#10637)	2024-12-03 20:29:54 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : automatic selection of best CPU backend (#10606)	2024-12-01 16:12:41 +01:00