llama.cpp/ggml
Jeff Bolz a91a41364b
vulkan: optimize coopmat2 dequant functions (#10855)
Change the code to do 16-bit loads when possible and extract the appropriate
component late, so the code is effectively decoding a pair of elements and
then selecting one. This can allow more commoning to happen in the compiler
when neighboring elements are loaded.
2024-12-21 08:04:45 +01:00
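
The idea in the commit message (load 16 bits, decode both packed elements, select one at the end) can be illustrated with a minimal C sketch. This is not the shader code from the commit, which lives in the Vulkan GLSL sources. The block layout `block_q8_demo` and the functions `dequant_single` and `dequant_pair` are hypothetical names invented for this illustration, and the 8-bit format with a single float scale is an assumption chosen for simplicity.

```c
#include <stdint.h>

/* Hypothetical block (not a real ggml type): one scale plus eight signed
 * 8-bit quants, packed two per 16-bit word with the even element in the
 * low byte. */
typedef struct {
    float    scale;
    uint16_t qs[4];
} block_q8_demo;

/* Sign-extend the low 8 bits of v without relying on narrowing casts. */
static int sext8(unsigned v) {
    return (int)((v & 0xFFu) ^ 0x80u) - 0x80;
}

/* Reference: select the byte first, then decode a single element. */
static float dequant_single(const block_q8_demo *b, unsigned idx) {
    unsigned word = b->qs[idx / 2];
    unsigned byte = (word >> (8u * (idx & 1u))) & 0xFFu;
    return b->scale * (float)sext8(byte);
}

/* Pair variant: one 16-bit load, decode both elements, select late.
 * Calls for idx = 2k and idx = 2k + 1 share the load and both decodes,
 * so a compiler can merge that work when neighbors are requested. */
static float dequant_pair(const block_q8_demo *b, unsigned idx) {
    unsigned word = b->qs[idx / 2];                 /* single 16-bit load */
    float lo = b->scale * (float)sext8(word);       /* even element */
    float hi = b->scale * (float)sext8(word >> 8);  /* odd element  */
    return (idx & 1u) ? hi : lo;                    /* late selection */
}
```

Both functions return the same values. The difference is only where the index-dependent selection happens: in `dequant_pair` everything before the final select depends on `idx / 2` alone, which is what gives the compiler identical expressions to share between neighboring elements.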
include tts : add OuteTTS support (#10784) 2024-12-18 19:27:21 +02:00
src vulkan: optimize coopmat2 dequant functions (#10855) 2024-12-21 08:04:45 +01:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : fix arm build (#10890) 2024-12-18 23:21:42 +01:00