Commit Graph

2 Commits

Author SHA1 Message Date
fairydreaming
d3f0c7166a
Stop the generation when <|eom_id|> token is encountered - needed for Llama 3.1 tool call support (#8858)
* gguf-py, llama : add constants and methods related to Llama-3.1 <|eom_id|> token

* llama : find Llama-3.1 <|eom_id|> token id during vocab loading

* llama-vocab : add Llama-3.1 <|eom_id|> token to the set of tokens stopping the generation

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2024-08-05 09:38:01 +02:00
Georgi Gerganov
938943cdbf
llama : move vocab, grammar and sampling into separate files (#8508)
* llama : move sampling code into llama-sampling

ggml-ci

* llama : move grammar code into llama-grammar

ggml-ci

* cont

ggml-ci

* cont : pre-fetch rules

* cont

ggml-ci

* llama : deprecate llama_sample_grammar

* llama : move tokenizers into llama-vocab

ggml-ci

* make : update llama.cpp deps [no ci]

* llama : redirect external API to internal APIs

ggml-ci

* llama : suffix the internal APIs with "_impl"

ggml-ci

* llama : clean-up
2024-07-23 13:10:17 +03:00