mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2025-01-11 11:11:46 +00:00
06b00827a0
* removed ggml_task_backend in favour of ggml_task_profile.runner and the newly added id and name. * extracted the mul_mat BLAS code into ggml_compute_forward_mul_mat_blas, aligning it more closely with CUDA/CL and making it easier to fix profile and run tune. * rewrote task profile and updated/added some CUDA/CL code, finally making CL GPU offloading work. * misc minor fixes/updates to tune; the data format was changed. |
||
---|---|---|
.gitignore | ||
CMakeLists.txt | ||
test-double-float.c | ||
test-ggml-threading.c | ||
test-ggml-tune.c | ||
test-grad0.c | ||
test-opt.c | ||
test-quantize-fns.cpp | ||
test-quantize-perf.cpp | ||
test-sampling.cpp | ||
test-tokenizer-0.cpp |