CI

In addition to GitHub Actions, llama.cpp uses a custom CI framework:

https://github.com/ggml-org/ci

It monitors the master branch for new commits and runs the ci/run.sh script on dedicated cloud instances. This allows us to execute heavier workloads than are practical with GitHub Actions alone. Over time, the cloud instances will be scaled to cover various hardware architectures, including GPU and Apple Silicon instances.

Collaborators can optionally trigger a CI run by adding the ggml-ci keyword to their commit message. Only branches of this repo are monitored for this keyword; commits on forks are not.
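
For example, assuming you have push access to a branch of this repo, a commit like the following would be picked up by the framework (the commit message text here is illustrative; only the ggml-ci keyword matters):

# create an empty commit whose message contains the ggml-ci keyword
git commit --allow-empty -m "ci : trigger heavy CI

ggml-ci"
git push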

It is good practice, before publishing changes, to execute the full CI locally on your machine:

# create a scratch directory for results and cached data
mkdir tmp

# CPU-only build
bash ./ci/run.sh ./tmp/results ./tmp/mnt

# with CUDA support
GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt
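
Here ./tmp/results is the output directory where logs and build artifacts are written, and ./tmp/mnt is a mount directory used to cache downloaded models and other data between runs, so repeated invocations are faster. A quick way to check the outcome afterwards (the exact file layout may vary between versions):

# list the produced logs and status files
ls -l ./tmp/results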