Commit Graph

8 Commits

Author SHA1 Message Date
Esko Toivonen
6bda7ce6c3
llama : add pre-tokenizer regexes for BLOOM and gpt3-finnish (#8850) 2024-08-15 10:17:12 +03:00
Keke Han
081fe431aa
llama : fix codeshell support (#8599)
* llama : fix codeshell support

* llama : move codeshell after smollm below to respect the enum order
2024-07-22 19:43:43 +03:00
Jason Stillerman
d94c6e0ccb
llama : add support for SmolLm pre-tokenizer (#8609)
* Adding SmolLM Pre Tokenizer

* Update convert_hf_to_gguf_update.py

Co-authored-by: compilade <git@compilade.net>

* Update src/llama.cpp

Co-authored-by: compilade <git@compilade.net>

* handle regex

* removed .inp and out .out ggufs

---------

Co-authored-by: compilade <git@compilade.net>
2024-07-22 17:43:01 +03:00
Jiří Podivín
566daa5a5b
*.py: Stylistic adjustments for python (#8233)
* Superflous parens in conditionals were removed.
* Unused args in function were removed.
* Replaced unused `idx` var with `_`
* Initializing file_format and format_version attributes
* Renaming constant to capitals
* Preventing redefinition of the `f` var

Signed-off-by: Jiri Podivin <jpodivin@redhat.com>
2024-07-22 23:44:53 +10:00
Michael Coppola
940362224d
llama : add support for Tekken pre-tokenizer (#8579)
* llama : Added support for Tekken pre-tokenizer (#8577)

Removed uneeded `vocab.tokenizer_clean_spaces` assignment

* llama : fix order of pre-tokenizers

* * Tekken pre-tokenizer no longer uses clean_up_tokenization_spaces
* Updated chkhsh for Tekken tokenizer

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-07-20 16:43:51 +03:00
Georgi Gerganov
e235b267a2
py : switch to snake_case (#8305)
* py : switch to snake_case

ggml-ci

* cont

ggml-ci

* cont

ggml-ci

* cont : fix link

* gguf-py : use snake_case in scripts entrypoint export

* py : rename requirements for convert_legacy_llama.py

Needed for scripts/check-requirements.sh

---------

Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-07-05 07:53:33 +03:00
ditsuke
01a5f06550 chore: Remove rebase artifacts 2024-07-04 15:39:13 +00:00
ditsuke
b0a46993df build(python): Package scripts with pip-0517 compliance 2024-07-04 15:39:13 +00:00