llama.cpp/ggml

Latest commit: 69c487f4ed by Johannes Gäßler, 2024-07-20 22:25:26 +02:00
CUDA: MMQ code deduplication + iquant support (#8495)

* CUDA: MMQ code deduplication + iquant support
* 1 less parallel job for CI build
Name            Last commit                                               Date
cmake           llama : reorganize source code + improve CMake (#8006)   2024-06-26 18:33:02 +03:00
include         CUDA: fix partial offloading for ne0 % 256 != 0 (#8572)  2024-07-18 23:48:47 +02:00
src             CUDA: MMQ code deduplication + iquant support (#8495)    2024-07-20 22:25:26 +02:00
.gitignore      vulkan : cmake integration (#8119)                        2024-07-13 18:12:39 +02:00
CMakeLists.txt  cmake : install all ggml public headers (#8480)           2024-07-18 17:47:12 +03:00