llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-11 03:01:45 +00:00

Author	SHA1	Message	Date
mqy	06b00827a0	bulk refactoring of task profile and related code to run CL GPU offloading. * removed ggml_task_backend in favour of ggml_task_profile.runner and the newly added id and name. * extracted mul_mat blas code into ggml_compute_forward_mul_mat_blas, to align with CUDA/CL a bit more and make it easier to fix profile and run tune. * rewrote task profile and updated/added some cuda/cl code, finally making CL GPU offloading work. * misc minor fixes/updates to tune; the data format was changed.	2023-06-18 14:27:56 +08:00
mqy	6b83a3e16f	try to make CL run w/o tuning, but -ngl gets stuck with no output. had to add task runner and profile id; many changes, see the f codes	2023-06-18 14:27:56 +08:00
mqy	48016f685c	bulk refactored task profile to support complete fallback; enable tune by default for ease of dev	2023-06-18 14:27:56 +08:00
mqy	213f133701	initial	2023-06-18 14:27:53 +08:00