root synced and deleted reference refs/tags/refs/pull/9589/merge at root/llama.cpp from mirror (2024-09-22 21:16:20 +00:00)
root synced new reference gg/metal-fa-f32-qk to root/llama.cpp from mirror (2024-09-22 21:16:20 +00:00)
- c35e586ea5 musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (#9526)
- 912c331d3d Fix merge error in #9454 (#9589)
root synced commits to refs/pull/9034/merge at root/llama.cpp from mirror (2024-09-22 21:16:20 +00:00)
- 66bb682bea Merge ccb45186d0 into 912c331d3d
- 912c331d3d Fix merge error in #9454 (#9589)
- a5b57b08ce CUDA: enable Gemma FA for HIP/Pascal (#9581)
- ecd5d6b65b llama: remove redundant loop when constructing ubatch (#9574)
- 2a63caaa69 RWKV v6: RWKV_WKV op CUDA implementation (#9454)
root synced commits to refs/pull/9058/merge at root/llama.cpp from mirror (2024-09-22 21:16:20 +00:00)
- bd10b72e7d Merge fc6abde7aa into c35e586ea5
- c35e586ea5 musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (#9526)
- 912c331d3d Fix merge error in #9454 (#9589)
root synced commits to refs/pull/9526/merge at root/llama.cpp from mirror (2024-09-22 13:06:20 +00:00)
- 4c30de9615 Merge 0fb0b4eab3 into a5b57b08ce
- 0fb0b4eab3 mtgpu: map cublasOperation_t to mublasOperation_t (sync code to latest)
- a3ad2c9971 mtgpu: enable unified memory
- 43ff5f36c2 mtgpu: disable flash attention on qy1 (MTT S80); disable q3_k and mul_mat_batched_cublas
- e40b33dcad mtgpu: add mp_21 support
root synced commits to refs/pull/9532/merge at root/llama.cpp from mirror (2024-09-22 13:06:20 +00:00)
- 3904dde7a2 Merge a829583c97 into a5b57b08ce
- a5b57b08ce CUDA: enable Gemma FA for HIP/Pascal (#9581)
root synced commits to refs/pull/9541/merge at root/llama.cpp from mirror (2024-09-22 13:06:20 +00:00)
- 8660d72920 Merge c42ec2f8bb into a5b57b08ce
- a5b57b08ce CUDA: enable Gemma FA for HIP/Pascal (#9581)
root synced commits to refs/pull/9544/merge at root/llama.cpp from mirror (2024-09-22 13:06:20 +00:00)
- 7fca5a04f3 Merge 4af076b494 into a5b57b08ce
- a5b57b08ce CUDA: enable Gemma FA for HIP/Pascal (#9581)
- c4d6f343d4 cuda: add q8_0->f32 cpy operation
- ecd5d6b65b llama: remove redundant loop when constructing ubatch (#9574)
- 2a63caaa69 RWKV v6: RWKV_WKV op CUDA implementation (#9454)
- d09770cae7 ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG (#9573)
root synced commits to refs/pull/9571/merge at root/llama.cpp from mirror (2024-09-22 13:06:20 +00:00)
- 1c73f3bd44 Merge c4d6f343d4 into a5b57b08ce
- c4d6f343d4 cuda: add q8_0->f32 cpy operation
- a5b57b08ce CUDA: enable Gemma FA for HIP/Pascal (#9581)
root synced commits to refs/pull/9579/merge at root/llama.cpp from mirror (2024-09-22 13:06:20 +00:00)
- eea1e6e277 Merge 33b692934f into a5b57b08ce
- a5b57b08ce CUDA: enable Gemma FA for HIP/Pascal (#9581)
root synced and deleted reference refs/tags/refs/pull/9581/merge at root/llama.cpp from mirror (2024-09-22 13:06:19 +00:00)
root synced commits to refs/pull/8837/merge at root/llama.cpp from mirror (2024-09-22 13:06:19 +00:00)
- a49798a0fb Merge 02c75452c1 into a5b57b08ce
- a5b57b08ce CUDA: enable Gemma FA for HIP/Pascal (#9581)
- ecd5d6b65b llama: remove redundant loop when constructing ubatch (#9574)
- 2a63caaa69 RWKV v6: RWKV_WKV op CUDA implementation (#9454)