llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-14 23:09:53 +00:00


Changyeon Kim 8f275a7c45 ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763)	2024-10-29 09:52:56 +01:00

* ggml: Add POOL2D OP for GPU ACC to the Vulkan.
  - The MobileVLM model now supports inference acceleration through GPU by utilizing the Vulkan backend.
  - A GGML_OP_POOL_2D shader has been added. (Pooling)
  - The encoding performance of the CLIP model improved from 2.8s on the CPU to 0.7s on the GPU.
* [fix] Correct the incorrect order of the parameters. fix casting to int.

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>

..
acc.comp	llava: Add ACC OP for GPU acceleration to the Vulkan backend in the LLAVA CLIP model. (#8984)	2024-08-20 21:00:00 +02:00
add.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
argsort.comp	vulkan : argsort barriers must be under uniform control flow (ggml/951)	2024-09-29 21:15:37 +03:00
clamp.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
CMakeLists.txt	cmake : Link vulkan-shaders-gen with pthreads (#8835)	2024-08-06 15:21:47 +02:00
concat.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
copy.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
cos.comp	sync : ggml	2024-08-27 22:41:27 +03:00
dequant_f32.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_funcs.comp	Vulkan IQ4_NL Support (#8613)	2024-07-23 10:56:49 +02:00
dequant_head.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_iq4_nl.comp	Vulkan IQ4_NL Support (#8613)	2024-07-23 10:56:49 +02:00
dequant_q2_k.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q3_k.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q4_0.comp	Vulkan IQ4_NL Support (#8613)	2024-07-23 10:56:49 +02:00
dequant_q4_1.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q4_k.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q5_0.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q5_1.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q5_k.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q6_k.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequant_q8_0.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
diag_mask_inf.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
div.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
gelu_quick.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
gelu.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
generic_binary_head.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
generic_head.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
generic_unary_head.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
get_rows_quant.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
get_rows.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
group_norm.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
im2col.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
leaky_relu.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
mul_mat_split_k_reduce.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
mul_mat_vec_base.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
mul_mat_vec_nc.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mat_vec_p021.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mat_vec_q2_k.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mat_vec_q3_k.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mat_vec_q4_k.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mat_vec_q5_k.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mat_vec_q6_k.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mat_vec.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul_mm.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
mul.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
norm.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
pad.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
pool2d.comp	ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763)	2024-10-29 09:52:56 +01:00
relu.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
repeat.comp	Vulkan Optimizations and Fixes (#8959)	2024-08-14 18:32:53 +02:00
rms_norm.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
rope_head.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
rope_neox.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
rope_norm.comp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
scale.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
silu.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
sin.comp	sync : ggml	2024-08-27 22:41:27 +03:00
soft_max.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
square.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
sum_rows.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
tanh.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
timestep_embedding.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
types.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
upscale.comp	vulkan : implement Stable Diffusion operators (ggml/904)	2024-08-05 08:50:57 +03:00
vulkan-shaders-gen.cpp	ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763)	2024-10-29 09:52:56 +01:00