* style: format with nixfmt/rfc101-style
* build(nix): Package gguf-py
* build(nix): Refactor to new scope for gguf-py
* build(nix): Exclude gguf-py from devShells
* build(nix): Refactor gguf-py derivation to take in exact deps
* build(nix): Enable pytestCheckHook and pythonImportsCheck for gguf-py
* build(python): Package python scripts with pyproject.toml
* chore: Cleanup
* dev(nix): Break up python/C devShells
* build(python): Relax pytorch version constraint
Nixpkgs has an older version
* chore: Move cmake to nativeBuildInputs for devShell
* fmt: Reconcile formatting with rebase
* style: nix fmt
* cleanup: Remove unnecessary __init__.py
* chore: Suggestions from review
- Filter out non-source files from llama-scripts flake derivation
- Clean up unused closure
- Remove scripts devShell
* revert: Bad changes
* dev: Simplify devShells, restore the -extra devShell
* build(nix): Add pyyaml for gguf-py
* chore: Remove some unused bindings
* dev: Add tiktoken to -extra devShells
Exposes a few attributes demonstrating how to build [singularity](https://docs.sylabs.io/guides/latest/user-guide/)/[apptainer](https://apptainer.org/) and Docker images re-using llama.cpp's Nix expression.
Built locally on `x86_64-linux` with `nix build github:someoneserge/llama.cpp/feat/nix/images#llamaPackages.{docker,docker-min,sif,llama-cpp}`; the build is fast and effective.
* flake.lock: update to hotfix CUDA::cuda_driver
Required to support https://github.com/ggerganov/llama.cpp/pull/4606
* flake.nix: rewrite
1. Split into separate files per output.
2. Added overlays, so that this flake can be integrated into others.
The names in the overlay are `llama-cpp`, `llama-cpp-opencl`,
`llama-cpp-cuda`, and `llama-cpp-rocm` so that they fit into the
broader set of Nix packages from [nixpkgs](https://github.com/nixos/nixpkgs).
3. Use [callPackage](https://summer.nixos.org/blog/callpackage-a-tool-for-the-lazy/)
rather than `with pkgs;` so that there's dependency injection rather
than dependency lookup.
4. Add a description and meta information for each package.
Each description includes a bit about what is used to accelerate that package.
5. Use specific CUDA packages instead of cudatoolkit on the advice of SomeoneSerge.
6. Format with `serokell/nixfmt` for a consistent style.
7. Update `flake.lock` with the latest goods.
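For illustration, the difference described in item 3 can be sketched roughly as below. This is a hypothetical minimal example, not the package expression from this change: the `examplePackage` binding, the `example` pname, the version, and the use of `<nixpkgs>` are all assumptions made for the sketch.

```nix
# Hypothetical sketch; names and paths are illustrative, not taken from this change.
let
  pkgs = import <nixpkgs> { };

  # Dependency injection: the package is a function of its dependencies.
  # Nothing inside the function reaches into `pkgs` (no `with pkgs;`),
  # so every input is visible in the argument list.
  examplePackage =
    { stdenv, cmake }:
    stdenv.mkDerivation {
      pname = "example";
      version = "0.0.0";
      src = ./.;
      nativeBuildInputs = [ cmake ];
    };
in
{
  # callPackage fills in `stdenv` and `cmake` from `pkgs` by argument name.
  default = pkgs.callPackage examplePackage { };

  # Arguments given explicitly take precedence over the automatic ones
  # (here the same cmake, just passed by hand).
  explicit = pkgs.callPackage examplePackage { cmake = pkgs.cmake; };
}
```

Because the inputs are plain function arguments, a consumer can swap any of them without editing the package expression itself.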
* flake.nix: use finalPackage instead of passing it manually
* nix: unclutter darwin support
* nix: pass most darwin frameworks unconditionally
...for simplicity
* *.nix: nixfmt
nix shell github:piegamesde/nixfmt/rfc101-style --command \
nixfmt flake.nix .devops/nix/*.nix
* flake.nix: add maintainers
* nix: move meta down to follow Nixpkgs style more closely
* nix: add missing meta attributes
nix: clarify the interpretation of meta.maintainers
nix: clarify the meaning of "broken" and "badPlatforms"
nix: passthru: expose the use* flags for inspection
E.g.:
```
❯ nix eval .#cuda.useCuda
true
```
* flake.nix: avoid re-evaluating nixpkgs too many times
* flake.nix: use flake-parts
* nix: migrate to pname+version
* flake.nix: overlay: expose both the namespace and the default attribute
* ci: add the (Nix) flakestry workflow
* nix: cmakeFlags: explicit OFF bools
* nix: cuda: reduce runtime closure
* nix: fewer rebuilds
* nix: respect config.cudaCapabilities
* nix: add the impure driver's location to the DT_RUNPATHs
* nix: clean sources more thoroughly
...this way outPaths change less frequently,
and so there are fewer rebuilds
* nix: explicit mpi support
* nix: explicit jetson support
* flake.nix: darwin: only expose the default
---------
Co-authored-by: Someone Serge <sergei.kozlukov@aalto.fi>