mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2025-01-12 19:50:17 +00:00
5d500e8ccf
CI
In addition to GitHub Actions, llama.cpp uses a custom CI framework:
https://github.com/ggml-org/ci
It monitors the master branch for new commits and runs the ci/run.sh script on dedicated cloud instances. This allows us to execute heavier workloads than GitHub Actions alone can handle. Over time, the cloud instances will be scaled to cover various hardware architectures, including GPU and Apple Silicon instances.
Collaborators can optionally trigger the CI run by adding the ggml-ci keyword to their commit message. Only the branches of this repo are monitored for this keyword.
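To illustrate the keyword convention, here is a minimal sketch that checks whether a commit message would request a CI run. The helper name has_ci_keyword is hypothetical, and the sketch assumes a plain substring match; how the CI framework actually detects the keyword is not described here.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the commit message contains "ggml-ci".
# Assumption: a plain substring match; the real CI may match differently.
has_ci_keyword() {
    case "$1" in
        *ggml-ci*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example commit message with the keyword appended at the end.
msg="ci : add 7B CUDA tests ggml-ci"

if has_ci_keyword "$msg"; then
    echo "CI run requested"
else
    echo "no CI run requested"
fi
```

In practice the keyword is simply typed into the message passed to git commit, for example at the end of the subject line.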
It is good practice to execute the full CI locally on your machine before publishing changes:
mkdir tmp
# CPU-only build
bash ./ci/run.sh ./tmp/results ./tmp/mnt
# with CUDA support
GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt
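The commands above can be wrapped in a small helper that creates both directories and reports the outcome. This is only a sketch: the function name run_ci_locally and its default paths are hypothetical, and it assumes ci/run.sh signals failure through a non-zero exit status.

```shell
#!/bin/bash
# Hypothetical wrapper around ci/run.sh; run it from the repository root.
# Assumption: ci/run.sh exits non-zero when the CI run fails.
run_ci_locally() {
    local out_dir="${1:-./tmp/results}"   # where results are written
    local mnt_dir="${2:-./tmp/mnt}"       # working/mount directory

    # Create both directories up front; -p makes reruns harmless.
    mkdir -p "$out_dir" "$mnt_dir"

    if bash ./ci/run.sh "$out_dir" "$mnt_dir"; then
        echo "local CI passed"
    else
        echo "local CI failed" >&2
        return 1
    fi
}
```

Calling run_ci_locally ./tmp/results ./tmp/mnt from the repository root mirrors the CPU-only command above.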