llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-25 02:44:36 +00:00

Johannes Gäßler cb5fad4c6c CUDA: refactor and optimize IQ MMVQ (#8215)	2024-07-01 20:39:06 +02:00
* CUDA: refactor and optimize IQ MMVQ
* uint -> uint32_t
* __dp4a -> ggml_cuda_dp4a
* remove MIN_CC_DP4A checks
* change default
* try CI fix
cmake	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
include	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
src	CUDA: refactor and optimize IQ MMVQ (#8215)	2024-07-01 20:39:06 +02:00
CMakeLists.txt	ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (#8140)	2024-06-26 21:34:14 +02:00
ggml_vk_generate_shaders.py	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00