llama.cpp/gguf-split at 8cef75c743ba13ebbd6d380c531200c768a8b8aa

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-12 03:31:46 +00:00

Latest commit: Johannes Gäßler 53ff6b9b9f GGUF: C++ refactor, backend support, misc fixes (#11030): remove ggml_tensor.backend; update CODEOWNERS [no ci]; remove gguf_get_data from API; revise GGUF API data types	2025-01-07 18:01:58 +01:00

CMakeLists.txt	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
gguf-split.cpp	GGUF: C++ refactor, backend support, misc fixes (#11030)	2025-01-07 18:01:58 +01:00
README.md	Fix --split-max-size (#6655)	2024-04-14 13:12:59 +02:00
tests.sh	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00

README.md

GGUF split Example

CLI to split / merge GGUF files.

Command line options:

--split: split a GGUF file into multiple GGUF files (default operation).
--split-max-size: maximum size of each split, given in M or G, e.g. 500M or 2G.
--split-max-tensors: maximum number of tensors in each split (default: 128).
--merge: merge multiple GGUF files into a single GGUF file.
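
To make the two split limits more concrete, here is a minimal Python sketch of how a size string such as 500M might be read and how tensors might be grouped into splits. It is an illustration only, not the tool's implementation: parse_max_size and plan_splits are hypothetical names, M and G are assumed to be decimal units, and the size limit and the tensor-count limit are assumed to be alternatives rather than combined.

```python
from typing import List, Optional

# Assumption: "M" and "G" are decimal units (10**6 and 10**9 bytes).
_UNITS = {"M": 10**6, "G": 10**9}


def parse_max_size(text: str) -> int:
    """Convert a size string such as '500M' or '2G' into a byte count."""
    if len(text) < 2 or text[-1] not in _UNITS or not text[:-1].isdigit():
        raise ValueError(f"expected a size like 500M or 2G, got {text!r}")
    return int(text[:-1]) * _UNITS[text[-1]]


def plan_splits(
    tensor_sizes: List[int],
    max_tensors: int = 128,
    max_size: Optional[int] = None,
) -> List[List[int]]:
    """Greedily group tensor indices into consecutive splits.

    Assumption: if max_size is given it is the only limit applied;
    otherwise each split holds at most max_tensors tensors.
    """
    splits: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for index, size in enumerate(tensor_sizes):
        if max_size is not None:
            # Start a new split when adding this tensor would exceed the limit.
            # A single tensor larger than the limit still gets its own split.
            full = bool(current) and current_bytes + size > max_size
        else:
            full = len(current) >= max_tensors
        if full:
            splits.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        splits.append(current)
    return splits
```

For example, under these assumptions plan_splits([400, 400, 400], max_size=1000) puts the first two tensors in one split and the third in another, and 300 tensors with the default tensor limit end up in three splits of 128, 128, and 44 tensors.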