llama.cpp/.devops/llama-cpp-cuda.srpm.spec

# SRPM for building from source and packaging an RPM for RPM-based distros.
# https://docs.fedoraproject.org/en-US/quick-docs/creating-rpm-packages
# Built and maintained by John Boero - boeroboy@gmail.com
# In honor of Seth Vidal https://www.redhat.com/it/blog/thank-you-seth-vidal

# Notes for llama.cpp:
# 1. Tags are currently based on hash - which will not sort asciibetically.
#    We need to declare standard versioning if people want to sort latest releases.
# 2. Builds for CUDA/OpenCL support are separate, with different dependencies.
# 3. NVIDIA's developer repo must be enabled with nvcc, cublas, clblas, etc. installed.
#    Example: https://developer.download.nvidia.com/compute/cuda/repos/fedora37/x86_64/cuda-fedora37.repo
# 4. OpenCL/CLBLAST support simply requires the ICD loader and basic opencl libraries.
#    It is up to the user to install the correct vendor-specific support.
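#
# Example build, as a rough sketch rather than a tested recipe. It assumes a
# Fedora host with dnf-plugins-core (dnf 4 syntax) and rpmdevtools installed,
# and that this spec is in the current directory:
#   sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora37/x86_64/cuda-fedora37.repo
#   spectool -g -R llama-cpp-cuda.srpm.spec
#   rpmbuild -ba llama-cpp-cuda.srpm.spec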

Name:           llama.cpp-cuda
Version:        %( date "+%%Y%%m%%d" )
Release:        1%{?dist}
Summary:        CUDA inference of LLaMA model in C/C++
License:        MIT
Source0:        https://github.com/ggerganov/llama.cpp/archive/refs/heads/master.tar.gz
BuildRequires:  coreutils make gcc-c++ git cuda-toolkit
Requires:       cuda-toolkit
URL:            https://github.com/ggerganov/llama.cpp

%define debug_package %{nil}
%define source_date_epoch_from_changelog 0

%description
CUDA-accelerated inference for Meta's Llama 2 models using default options.

%prep
%setup -n llama.cpp-master

%build
make -j GGML_CUDA=1

%install
mkdir -p %{buildroot}%{_bindir}/
cp -p llama-cli %{buildroot}%{_bindir}/llama-cuda-cli
cp -p llama-server %{buildroot}%{_bindir}/llama-cuda-server
cp -p llama-simple %{buildroot}%{_bindir}/llama-cuda-simple

mkdir -p %{buildroot}/usr/lib/systemd/system
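# Quote the heredoc delimiter so the shell does not expand $LLAMA_ARGS and
# $MAINPID to empty strings at build time; systemd must see them literally.
%{__cat} <<'EOF'  > %{buildroot}/usr/lib/systemd/system/llamacuda.service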
[Unit]
Description=Llama.cpp server, CUDA build.
After=syslog.target network.target local-fs.target remote-fs.target nss-lookup.target

[Service]
Type=simple
EnvironmentFile=/etc/sysconfig/llama
ExecStart=/usr/bin/llama-cuda-server $LLAMA_ARGS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=no

[Install]
WantedBy=default.target
EOF

mkdir -p %{buildroot}/etc/sysconfig
%{__cat} <<EOF  > %{buildroot}/etc/sysconfig/llama
LLAMA_ARGS="-m /opt/llama2/ggml-model-f32.bin"
EOF
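
# Usage after installing the package (a sketch; the model path above is only a
# default, so edit /etc/sysconfig/llama to point LLAMA_ARGS at a real model):
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now llamacuda.service
#   journalctl -u llamacuda.service -f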

%clean
rm -rf %{buildroot}
rm -rf %{_builddir}/*

%files
%{_bindir}/llama-cuda-cli
%{_bindir}/llama-cuda-server
%{_bindir}/llama-cuda-simple
/usr/lib/systemd/system/llamacuda.service
%config /etc/sysconfig/llama

%pre

%post

%preun
%postun

%changelog