llama.cpp/ggml
Latest commit cb5fad4c6c by Johannes Gäßler: CUDA: refactor and optimize IQ MMVQ (#8215)
* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
2024-07-01 20:39:06 +02:00
Name                          Last commit                                                                               Date
cmake                         llama : reorganize source code + improve CMake (#8006)                                    2024-06-26 18:33:02 +03:00
include                       llama : reorganize source code + improve CMake (#8006)                                    2024-06-26 18:33:02 +03:00
src                           CUDA: refactor and optimize IQ MMVQ (#8215)                                               2024-07-01 20:39:06 +02:00
CMakeLists.txt                ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (#8140)    2024-06-26 21:34:14 +02:00
ggml_vk_generate_shaders.py   llama : reorganize source code + improve CMake (#8006)                                    2024-06-26 18:33:02 +03:00