llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-14 23:09:53 +00:00

Georgi Gerganov b0f27361f3	2024-09-24 09:03:17 +03:00

sampling : avoid expensive softmax during greedy sampling (#9605)

* sampling : avoid expensive softmax during greedy sampling

ggml-ci

* speculative : fix default RNG seed + set sparams.n_probs

* Update tests/test-sampling.cpp

Co-authored-by: slaren <slarengh@gmail.com>

* sampling : add clarifying comment [no ci]

---------

Co-authored-by: slaren <slarengh@gmail.com>

llama.h	sampling : avoid expensive softmax during greedy sampling (#9605)	2024-09-24 09:03:17 +03:00
