Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2024-11-15 07:19:53 +00:00
3bc7103d2e
This makes the weight buft detection in src/llama.cpp simpler.

* convert : transpose Mamba-2 A, D and reshape SSM_NORM

  This breaks existing conversions of Mamba-2 models to avoid some reshapes. Not sure if it's a good idea, but it makes the graph slightly cleaner.

* llama : more appropriate SSM_SCAN and SSM_CONV buft support checks

| Name |
|---|
| ggml-alloc.h |
| ggml-amx.h |
| ggml-backend.h |
| ggml-blas.h |
| ggml-cann.h |
| ggml-cuda.h |
| ggml-kompute.h |
| ggml-metal.h |
| ggml-rpc.h |
| ggml-sycl.h |
| ggml-vulkan.h |
| ggml.h |