Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2024-12-26 11:24:35 +00:00
Commit e11bd856d5
* CPU/CUDA: Gemma 2 FlashAttention support
* apply logit_softcap to scale in kernel
* disable logit softcapping tests on Metal
* remove metal check
ggml-alloc.h
ggml-backend.h
ggml-blas.h
ggml-cann.h
ggml-cuda.h
ggml-kompute.h
ggml-metal.h
ggml-rpc.h
ggml-sycl.h
ggml-vulkan.h
ggml.h