Minsoo Cheong
c679e0cb5c
llama : add EXAONE model support (#9025)
* add exaone model support
* add chat template
* fix whitespace
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* add ftype
* add exaone pre-tokenizer in `llama-vocab.cpp`
Co-authored-by: compilade <113953597+compilade@users.noreply.github.com>
* fix lint
Co-authored-by: compilade <113953597+compilade@users.noreply.github.com>
* add `EXAONE` to supported models in `README.md`
* fix space
Co-authored-by: compilade <git@compilade.net>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: compilade <113953597+compilade@users.noreply.github.com>
Co-authored-by: compilade <git@compilade.net>
2024-08-16 09:35:18 +03:00
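Note: the chat-template part of this commit teaches llama.cpp to recognize the EXAONE conversation format. A minimal sketch of assembling a prompt in that style, assuming the role tags from the EXAONE-3.0 model card (`[|system|]`, `[|user|]`, `[|assistant|]`, `[|endofturn|]`); the `format_exaone` helper is hypothetical, not the llama.cpp API:

```cpp
#include <string>
#include <vector>

struct chat_msg { std::string role; std::string content; };

// Hypothetical helper: builds a prompt in the EXAONE-3.0 chat style.
// Verify the exact tag strings against the model's actual chat template.
static std::string format_exaone(const std::vector<chat_msg> & msgs) {
    std::string out;
    for (const auto & m : msgs) {
        if (m.role == "system") {
            out += "[|system|]" + m.content + "[|endofturn|]\n";
        } else if (m.role == "user") {
            out += "[|user|]" + m.content + "\n";
        } else { // assistant
            out += "[|assistant|]" + m.content + "[|endofturn|]\n";
        }
    }
    out += "[|assistant|]"; // leave the assistant turn open for generation
    return out;
}
```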
Zhenwei Jin
4af8420afb
common : remove duplicate function llama_should_add_bos_token (#8778)
2024-08-15 10:23:23 +03:00
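Note: the removed helper was a thin wrapper in `common` around a query the C API already answers. A sketch of the duplicated logic, approximate to the 2024-era API where `llama_add_bos_token` could return -1 for "unknown" and SPM vocabs defaulted to adding BOS:

```cpp
#include "llama.h"

// Approximate sketch of the duplicate helper this commit removes;
// see the PR for the exact code that was deleted.
static bool should_add_bos(const llama_model * model) {
    const int add_bos = llama_add_bos_token(model); // -1 = unknown, else 0/1
    return add_bos != -1 ? bool(add_bos)
                         : (llama_vocab_type(model) == LLAMA_VOCAB_TYPE_SPM);
}
```

After the change, call sites query `llama_add_bos_token(model)` directly instead of going through the wrapper.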
Esko Toivonen
6bda7ce6c3
llama : add pre-tokenizer regexes for BLOOM and gpt3-finnish (#8850)
2024-08-15 10:17:12 +03:00
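Note: new pre-tokenizers like these follow a common wiring pattern in `llama-vocab.cpp`: the GGUF metadata string `tokenizer.ggml.pre` is mapped to an enum during vocab loading, and BPE tokenization then selects that pre-tokenizer's split regexes. A condensed sketch of the pattern; the enum names and regex strings below are illustrative, not the exact llama.cpp code:

```cpp
#include <string>
#include <vector>

// Illustrative pre-tokenizer ids (the real ones are LLAMA_VOCAB_PRE_TYPE_*).
enum pre_type { PRE_DEFAULT, PRE_BLOOM, PRE_GPT3_FINNISH };

// Map the "tokenizer.ggml.pre" metadata string to a pre-tokenizer id.
static pre_type pre_type_from_name(const std::string & name) {
    if (name == "bloom")        return PRE_BLOOM;
    if (name == "gpt3-finnish") return PRE_GPT3_FINNISH;
    return PRE_DEFAULT;
}

// Pick the regexes used to split text before BPE merges run.
static std::vector<std::string> pre_regexes(pre_type t) {
    switch (t) {
        case PRE_BLOOM:
        case PRE_GPT3_FINNISH:
            // In the actual change both map to an existing regex set;
            // this pattern is illustrative only.
            return { " ?[^(\\s|.,!?…。，、।۔،)]+" };
        default:
            return { "[0-9]+", "[a-zA-Z]+" }; // placeholder defaults
    }
}
```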
Georgi Gerganov
45a55b91aa
llama : better replace_all (cont) (#8926)
* llama : better replace_all (cont)
ggml-ci
* code : deduplicate replace_all
ggml-ci
2024-08-09 18:23:52 +03:00
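Note: the improved `replace_all` does a single forward pass that appends into a separate builder string, instead of repeatedly calling `std::string::replace` on the original (which is quadratic and can loop forever when the replacement contains the search string). A sketch close to what landed:

```cpp
#include <string>

// Build the result in one pass: copy the unmatched prefix, append the
// replacement, then continue searching after the match.
static void replace_all(std::string & s, const std::string & search, const std::string & replace) {
    if (search.empty()) {
        return; // nothing sensible to replace
    }
    std::string builder;
    builder.reserve(s.length());
    size_t last_pos = 0;
    size_t pos;
    while ((pos = s.find(search, last_pos)) != std::string::npos) {
        builder.append(s, last_pos, pos - last_pos);
        builder.append(replace);
        last_pos = pos + search.length();
    }
    builder.append(s, last_pos, std::string::npos);
    s = std::move(builder);
}
```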
Douglas Hanley
cdd1889de6
convert : add support for XLMRoberta embedding models (#8658)
* add conversion for bge-m3; small fix in unigram tokenizer
* clean up and simplify XLMRoberta conversion
2024-08-06 10:20:54 +03:00
fairydreaming
d3f0c7166a
Stop the generation when the <|eom_id|> token is encountered - needed for Llama 3.1 tool call support (#8858)
* gguf-py, llama : add constants and methods related to Llama-3.1 <|eom_id|> token
* llama : find Llama-3.1 <|eom_id|> token id during vocab loading
* llama-vocab : add Llama-3.1 <|eom_id|> token to the set of tokens stopping the generation
---------
Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2024-08-05 09:38:01 +02:00
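Note: on the consuming side, generation loops do not test individual stop tokens; they ask the vocab whether a sampled token is end-of-generation, and this commit adds <|eom_id|> to that set. A hedged usage sketch against the 2024-era C API, where `sample_next` stands in for whatever sampling the caller uses:

```cpp
#include "llama.h"

// Sketch of a decode loop that stops on any end-of-generation token
// (EOS, EOT, and, after this change, Llama 3.1's <|eom_id|>).
static void generate(const llama_model * model, llama_token (*sample_next)(void)) {
    while (true) {
        const llama_token tok = sample_next();
        if (llama_token_is_eog(model, tok)) {
            break; // <|eom_id|> lands here, enabling a tool-call handoff
        }
        // ... decode tok into the context and continue ...
    }
}
```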
slaren
2b1f616b20
ggml : reduce hash table reset cost (#8698)
* ggml : reduce hash table reset cost
* fix unreachable code warnings after GGML_ASSERT(false)
* GGML_ASSERT(false) -> GGML_ABORT("fatal error")
* GGML_ABORT use format string
2024-07-27 04:41:55 +02:00
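Note: the cost being reduced is the memset over the whole keys array on every reset. The change tracks occupancy in a compact bitset, so a reset only clears the bitset and key slots are overwritten lazily on insert. A simplified sketch of the idea; the struct and function names are illustrative, not the exact ggml ones:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// One "used" bit per slot; keys stay dirty and are only trusted
// where the corresponding bit is set.
struct hash_set {
    size_t        size;
    uint32_t    * used; // (size + 31) / 32 words
    const void ** keys;
};

static inline bool slot_used(const hash_set * h, size_t i) {
    return (h->used[i >> 5] >> (i & 31)) & 1;
}

static inline void slot_mark(hash_set * h, size_t i) {
    h->used[i >> 5] |= 1u << (i & 31);
}

static void hash_set_reset(hash_set * h) {
    // Clears O(size/32) words instead of O(size) pointers.
    memset(h->used, 0, ((h->size + 31) / 32) * sizeof(uint32_t));
}
```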
Georgi Gerganov
938943cdbf
llama : move vocab, grammar and sampling into separate files (#8508)
* llama : move sampling code into llama-sampling
ggml-ci
* llama : move grammar code into llama-grammar
ggml-ci
* cont
ggml-ci
* cont : pre-fetch rules
* cont
ggml-ci
* llama : deprecate llama_sample_grammar
* llama : move tokenizers into llama-vocab
ggml-ci
* make : update llama.cpp deps [no ci]
* llama : redirect external API to internal APIs
ggml-ci
* llama : suffix the internal APIs with "_impl"
ggml-ci
* llama : clean-up
2024-07-23 13:10:17 +03:00
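Note: the refactor keeps `llama.h` stable: the public C entry points stay in llama.cpp and forward to internal `_impl` functions that now live in llama-vocab, llama-grammar, and llama-sampling. A sketch of the redirection pattern; the `_impl` suffix is from the commit, but the function and the stand-in types below are illustrative:

```cpp
#include <cstdint>
#include <string>

// Stand-in types for the sketch; the real ones are llama.cpp internals.
struct llama_vocab { /* tokenizer tables ... */ };
struct llama_model { llama_vocab vocab; };
using  llama_token = int32_t;

// Internal API (llama-vocab.*): suffixed with _impl, takes the vocab directly.
static std::string llama_token_to_piece_impl(const llama_vocab & /*vocab*/, llama_token token) {
    return "<piece:" + std::to_string(token) + ">"; // placeholder body
}

// Public API surface: signature preserved, body reduced to a thin forward.
// (Illustrative: the real C function writes into a caller-provided buffer.)
static std::string llama_token_to_piece(const llama_model * model, llama_token token) {
    return llama_token_to_piece_impl(model->vocab, token);
}
```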