Olivier Chafik
5265c15d4c
rename llama|main -> llama-cli; consistent RPM bin prefixes
2024-06-10 15:34:14 +01:00
Olivier Chafik
ab5efbb3b6
Prefix all example bins w/ llama-
2024-06-08 13:42:01 +01:00
Olivier Chafik
8b7c734473
main: update refs -> llama
fix examples/main ref
2024-06-06 15:44:51 +01:00
Ahmet Zeer
07cd41d096
TypoFix (#7162)
2024-05-09 10:16:45 +02:00
fraxy-v
92397d87a4
convert-llama2c-to-ggml : enable conversion of GQA models (#6237)
* convert-llama2c-to-ggml: enable conversion of multiqueries, #5608
* add test in build action
* Update build.yml
* Update build.yml
* Update build.yml
* gg patch
2024-03-22 20:49:06 +02:00
Olivier Chafik
230d46c723
examples : update llama2.c converter to read vocab and write models in GGUF format (#2751)
* llama2.c: direct gguf output (WIP)
* Simplify vector building logic
* llama2.c gguf conversion: fix token types in converter
* llama2.c: support copying vocab from a llama gguf model file
* llama2.c: update default path for vocab model + readme
* llama2.c: use defines for gguf keys
* llama2.c: escape whitespaces w/ U+2581 in vocab converter the llama.cpp way
* llama2.c converter: cleanups + take n_ff from config
2023-08-27 17:13:31 +03:00
Olivier Chafik
95385241a9
examples : restore the functionality to import llama2.c models (#2685)
* Fix import of llama2.c models that don't share weights between embedding layers
* llama2c: reinstate ggmlv3 conversion output + update readme w/ gguf conv
* llama2.c: comment out legacy "load from ggml model" logic
* llama2.c: convert special-cased "<0xXX>" single byte tokens from tokenizer.bin
2023-08-23 22:33:05 +03:00
byte-6174
b19edd54d5
Adding support for llama2.c models (#2559)
2023-08-12 01:17:25 +02:00