llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-11 21:39:52 +00:00

Author	SHA1	Message	Date
Mathijs de Bruin	633782b8d9	nix: now that we can do so, allow MacOS to build Vulkan binaries Author: Philip Taron <philip.taron@gmail.com> Date: Tue Feb 13 20:28:02 2024 +0000	2024-02-19 14:49:49 -08:00
Martin Schwaighofer	60ecf099ed	add Vulkan support to Nix flake	2024-02-03 13:13:07 -06:00
Someone Serge	b2d80e105a	flake.nix: add a comment about flakes vs nix	2024-01-22 12:19:30 +00:00
Philip Taron	bee938da74	nix: remove nixConfig from flake.nix (#4984)	2024-01-16 09:56:21 -08:00
Someone Serge	198ed7ebfc	flake.nix: suggest the binary caches	2023-12-31 13:14:58 -08:00
Someone Serge	356ea17e0f	flake.nix: expose checks	2023-12-31 13:14:58 -08:00
Someone Serge	a5c088d8c6	flake.nix: rocm not yet supported on aarch64, so hide the output	2023-12-31 13:14:58 -08:00
Someone Serge	1e3900ebac	flake.nix: expose full scope in legacyPackages	2023-12-31 13:14:58 -08:00
Philip Taron	68eccbdc5b	flake.nix : rewrite (#4605) * flake.lock: update to hotfix CUDA::cuda_driver Required to support https://github.com/ggerganov/llama.cpp/pull/4606 * flake.nix: rewrite 1. Split into separate files per output. 2. Added overlays, so that this flake can be integrated into others. The names in the overlay are `llama-cpp`, `llama-cpp-opencl`, `llama-cpp-cuda`, and `llama-cpp-rocm` so that they fit into the broader set of Nix packages from [nixpkgs](https://github.com/nixos/nixpkgs). 3. Use [callPackage](https://summer.nixos.org/blog/callpackage-a-tool-for-the-lazy/) rather than `with pkgs;` so that there's dependency injection rather than dependency lookup. 4. Add a description and meta information for each package. The description includes a bit about what's trying to accelerate each one. 5. Use specific CUDA packages instead of cudatoolkit on the advice of SomeoneSerge. 6. Format with `serokell/nixfmt` for a consistent style. 7. Update `flake.lock` with the latest goods. 
* flake.nix: use finalPackage instead of passing it manually * nix: unclutter darwin support * nix: pass most darwin frameworks unconditionally ...for simplicity * .nix: nixfmt nix shell github:piegamesde/nixfmt/rfc101-style --command \ nixfmt flake.nix .devops/nix/.nix * flake.nix: add maintainers * nix: move meta down to follow Nixpkgs style more closely * nix: add missing meta attributes nix: clarify the interpretation of meta.maintainers nix: clarify the meaning of "broken" and "badPlatforms" nix: passthru: expose the use* flags for inspection E.g.: ``` ❯ nix eval .#cuda.useCuda true ``` * flake.nix: avoid re-evaluating nixpkgs too many times * flake.nix: use flake-parts * nix: migrate to pname+version * flake.nix: overlay: expose both the namespace and the default attribute * ci: add the (Nix) flakestry workflow * nix: cmakeFlags: explicit OFF bools * nix: cuda: reduce runtime closure * nix: fewer rebuilds * nix: respect config.cudaCapabilities * nix: add the impure driver's location to the DT_RUNPATHs * nix: clean sources more thoroughly ...this way outPaths change less frequently, and so there are fewer rebuilds * nix: explicit mpi support * nix: explicit jetson support * flake.nix: darwin: only expose the default --------- Co-authored-by: Someone Serge <sergei.kozlukov@aalto.fi>	2023-12-29 16:42:26 +02:00
Tungsten842	07178c98e1	flake.nix: fix for rocm 5.7 (#3853)	2023-10-31 19:24:03 +02:00
Erik Scholz	ff3bad83e2	flake : update flake.lock for newer transformers version + provide extra dev shell (#3797) * flake : update flake.lock for newer transformers version + provide extra dev shell with torch and transformers (for most convert-xxx.py scripts)	2023-10-28 16:41:07 +02:00
Eve	017efe899d	cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273) * fix LLAMA_NATIVE * syntax * alternate implementation * my eyes must be getting bad... * set cmake LLAMA_NATIVE=ON by default * march=native doesn't work for ios/tvos, so disable for those targets. also see what happens if we use it on msvc * revert `8283237` and only allow LLAMA_NATIVE on x86 like the Makefile * remove -DLLAMA_MPI=ON --------- Co-authored-by: netrunnereve <netrunnereve@users.noreply.github.com>	2023-10-03 19:53:15 +03:00
Erik Scholz	a98b1633d5	nix : add cuda, use a symlinked toolkit for cmake (#3202)	2023-09-25 13:48:30 +02:00
kang	80834daecf	flake : Restore default package's buildInputs (#3262)	2023-09-20 15:48:22 +02:00
Evgeny Kurnevsky	235f7c193b	flake : use pkg-config instead of pkgconfig (#3188) pkgconfig is an alias, it got removed from nixpkgs: `295a5e1e2b/pkgs/top-level/aliases.nix (L1408)`	2023-09-15 11:10:22 +03:00
jneem	feea179e9f	flake : allow $out/include to already exist (#3175)	2023-09-14 21:54:47 +03:00
Asbjørn Olling	cf8238e7f4	flake : include llama.h in nix output (#3159)	2023-09-14 20:25:00 +03:00
takov751	ec2a24fedf	flake : add train-text-from-scratch to flake.nix (#3042)	2023-09-08 19:06:26 +03:00
Tungsten842	61d1a2895e	flake.nix : add rocm support and cleanup (#2808)	2023-08-26 21:19:44 +03:00
Volodymyr Vitvitskyi	f305bad11e	flake : build llama.cpp on Intel with nix (#2795) Problem ------- `nix build` fails with missing `Accelerate.h`. Changes ------- - Fix build of the llama.cpp with nix for Intel: add the same SDK frameworks as for ARM - Add `quantize` app to the output of nix flake - Extend nix devShell with llama-python so we can use convertScript Testing ------- Testing the steps with nix: 1. `nix build` Get the model and then 2. `nix develop` and then `python convert.py models/llama-2-7b.ggmlv3.q4_0.bin` 3. `nix run llama.cpp#quantize -- open_llama_7b/ggml-model-f16.gguf ./models/ggml-model-q4_0.bin 2` 4. `nix run llama.cpp#llama -- -m models/ggml-model-q4_0.bin -p "What is nix?" -n 400 --temp 0.8 -e -t 8` Co-authored-by: Volodymyr Vitvitskyi <volodymyrvitvitskyi@SamsungPro.local>	2023-08-26 16:25:39 +03:00
Shouzheng Liu	bf83bff674	metal : matrix-matrix multiplication kernel (#2615) * metal: matrix-matrix multiplication kernel This commit removes MPS and uses custom matrix-matrix multiplication kernels for all quantization types. This commit also adds grouped-query attention to support llama2 70B. * metal: fix performance degradation from gqa Integers are slow on the GPU, and 64-bit divides are extremely slow. In the context of GQA, we introduce a 64-bit divide that cannot be optimized out by the compiler, which results in a decrease of ~8% in inference performance. This commit fixes that issue by calculating a part of the offset with a 32-bit divide. Naturally, this limits the size of a single matrix to ~4GB. However, this limitation should suffice for the near future. * metal: fix bugs for GQA and perplexity test. I mixed up ne02 and nb02 in previous commit.	2023-08-16 23:07:04 +03:00
wzy	bc3ec2cdc9	flake : support `nix build '.#opencl'` (#2337)	2023-07-23 14:57:02 +03:00
wzy	78a3d13424	flake : remove intel mkl from flake.nix due to missing files (#2277) NixOS's mkl misses some libraries like mkl-sdl.pc. See #2261 Currently NixOS doesn't have intel C compiler (icx, icpx). See https://discourse.nixos.org/t/packaging-intel-math-kernel-libraries-mkl/975 So remove it from flake.nix Some minor changes: - Change pkgs.python310 to pkgs.python3 to keep latest - Add pkgconfig to devShells.default - Remove installPhase because we have `cmake --install` from #2256	2023-07-21 13:26:34 +03:00
wzy	45a1b07e9b	flake : update flake.nix (#2270) When `isx86_32 || isx86_64`, it will use mkl, else openblas According to https://discourse.nixos.org/t/rpath-of-binary-contains-a-forbidden-reference-to-build/12200/3, add -DCMAKE_SKIP_BUILD_RPATH=ON Fix #2261, Nix doesn't provide mkl-sdl.pc. When we build with -DBUILD_SHARED_LIBS=ON, -DLLAMA_BLAS_VENDOR=Intel10_lp64 replace mkl-sdl.pc by mkl-dynamic-lp64-iomp.pc	2023-07-19 10:01:55 +03:00
Dave Della Costa	a6803cab94	flake : add runHook preInstall/postInstall to installPhase so hooks function (#2224)	2023-07-14 22:13:38 +03:00
Rowan Hart	fdd1860911	flake : fix ggml-metal.metal path and run nixfmt (#1974)	2023-06-24 14:07:08 +03:00
Faez Shakil	fc45a81bc6	exposed modules so that they can be invoked by nix run github:ggerganov/llama.cpp#server etc (#1863)	2023-06-17 14:13:05 +02:00
Andrei	303f5809f1	metal : fix issue with ggml-metal.metal path. Closes #1769 (#1782) * Fix issue with ggml-metal.metal path * Add ggml-metal.metal as a resource for llama target * Update flake.nix metal kernel substitution	2023-06-10 17:47:34 +03:00
jacobi petrucciani	5b57a5b726	flake : update to support metal on m1/m2 (#1724)	2023-06-07 07:15:31 +03:00
Pavol Rusnak	bb98e77be7	nix: use convert.py instead of legacy wrapper convert-pth-to-ggml.py (#981)	2023-04-25 23:19:57 +02:00
Pavol Rusnak	a32f7acc9f	py : cleanup dependencies (#962) after #545 we do not need torch, tqdm and requests in the dependencies	2023-04-14 15:37:11 +02:00
Pavol Rusnak	c729ff730a	flake.nix: add all binaries from bin (#848)	2023-04-13 15:49:05 +02:00
lon	317fb12fbd	Add new binaries to flake.nix (#847)	2023-04-08 12:04:23 +02:00
Ben Siraphob	a18c19259a	Fix Nix build	2023-03-23 17:51:26 +01:00
Ben Siraphob	bd4b46d6ba	Nix flake: set meta.mainProgram to llama	2023-03-20 22:50:22 +01:00
Niklas Korz	a292747893	Nix flake (#40) * Nix flake * Nix: only add Accelerate framework on macOS * Nix: development shel, direnv and compatibility * Nix: use python packages supplied by withPackages * Nix: remove channel compatibility * Nix: fix ARM neon dotproduct on macOS --------- Co-authored-by: Pavol Rusnak <pavol@rusnak.io>	2023-03-17 23:03:48 +01:00

36 Commits