mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-14 14:59:52 +00:00

Georgi Gerganov 55e47786e3 llama : default sampling changes + greedy update (#9897)	2024-10-21 09:46:40 +03:00

* llama : deprecate softmax sampler + fix dist sampler
  ggml-ci
* tests : replace macros with functions
  ggml-ci
* sampling : change temperature sampler logic
  For t <= 0.0f, keep the max logit intact and set the rest to -inf
* cont : no need for special "greedy" logic
  top-k == 1 is the same
* tests : init prob correctly
* llama : handle temp <= 0.0 in the temp_ext sampler too
  ggml-ci
* cont : avoid extra loop in temperature sampler for sub-zero temp
  ggml-ci
CMakeLists.txt	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)	2024-06-13 00:41:52 +01:00
README.md	speculative : implement stochastic speculative sampling (#5625)	2024-03-04 20:24:00 +02:00
speculative.cpp	llama : default sampling changes + greedy update (#9897)	2024-10-21 09:46:40 +03:00

README.md

llama.cpp/examples/speculative

Demonstration of speculative decoding and tree-based speculative decoding techniques

More info: