llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-14 14:59:52 +00:00

Georgi Gerganov 6b0a7420d0	llama : KV cache view API + better KV cache management (#4170)	2023-11-23 19:07:56 +02:00
* llama : keep track of used KV cells + better KV cache management
* llama : zero KV cache used upon clear
  ggml-ci
* llama : allow exporting a view of the KV cache (#4180)
* Allow exporting a view of the KV cache
* Allow dumping the sequences per cell in common
* Track max contiguous cells value and position as well
* Fix max contiguous empty cells index calculation
  Make dump functions deal with lengths or sequences counts > 10 better
* Fix off by one error in dump_kv_cache_view
* Add doc comments for KV cache view functions
  Eliminate cell sequence struct; use llama_seq_id directly
  Minor cleanups
* common : add -dkvc arg for enabling kv cache dumps
---------
Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
base64.hpp	llava : expose as a shared library for downstream projects (#3613)	2023-11-07 00:36:23 +03:00
build-info.cpp.in	build : link against build info instead of compiling against it (#3879)	2023-11-02 08:50:16 +02:00
CMakeLists.txt	llava : expose as a shared library for downstream projects (#3613)	2023-11-07 00:36:23 +03:00
common.cpp	llama : KV cache view API + better KV cache management (#4170)	2023-11-23 19:07:56 +02:00
common.h	llama : KV cache view API + better KV cache management (#4170)	2023-11-23 19:07:56 +02:00
console.cpp	check C++ code with -Wmissing-declarations (#3184)	2023-09-15 15:38:27 -04:00
console.h	gguf : new file format with flexible meta data (beta) (#2398)	2023-08-21 23:07:43 +03:00
grammar-parser.cpp	ggml : fix rope + llama minor optimizations (#3560)	2023-10-20 13:02:12 +03:00
grammar-parser.h	gguf : new file format with flexible meta data (beta) (#2398)	2023-08-21 23:07:43 +03:00
log.h	log : make generating separate log files optional (#3787)	2023-11-01 16:18:27 +02:00
sampling.cpp	sampling : null grammar field after reset (#3885)	2023-11-01 15:40:43 +02:00
sampling.h	samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841)	2023-10-31 20:44:49 +01:00
stb_image.h	examples: support LLaVA v1.5 (multimodal model) (#3436)	2023-10-12 18:23:18 +03:00
train.cpp	train : move number of gpu layers argument parsing to common/train.cpp (#4074)	2023-11-17 17:19:16 +02:00
train.h	sync : ggml (backend v2) (#3912)	2023-11-13 14:16:23 +02:00