daminho
c837981bba
py : add Phi-1.5/Phi-2 tokenizer ( #9361 )
* add phi2 tokenizer
* add phi name to convert_hf_to_gguf_update.py
* make tokenizer_pre consistent; llama.cpp work
2024-09-12 14:28:20 +03:00
Pavel Zloi
8db003a19d
py : support converting local models ( #7547 )
* Support of converting local models added to convert-hf-to-gguf-update.py
* Description fixed
* shutil added to imports
2024-09-11 15:29:51 +03:00
Minsoo Cheong
c679e0cb5c
llama : add EXAONE model support ( #9025 )
* add exaone model support
* add chat template
* fix whitespace
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* add ftype
* add exaone pre-tokenizer in `llama-vocab.cpp`
Co-Authored-By: compilade <113953597+compilade@users.noreply.github.com>
* fix lint
Co-Authored-By: compilade <113953597+compilade@users.noreply.github.com>
* add `EXAONE` to supported models in `README.md`
* fix space
Co-authored-by: compilade <git@compilade.net>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: compilade <113953597+compilade@users.noreply.github.com>
Co-authored-by: compilade <git@compilade.net>
2024-08-16 09:35:18 +03:00
Esko Toivonen
6bda7ce6c3
llama : add pre-tokenizer regexes for BLOOM and gpt3-finnish ( #8850 )
2024-08-15 10:17:12 +03:00
Keke Han
081fe431aa
llama : fix codeshell support ( #8599 )
* llama : fix codeshell support
* llama : move codeshell after smollm below to respect the enum order
2024-07-22 19:43:43 +03:00
Jason Stillerman
d94c6e0ccb
llama : add support for SmolLm pre-tokenizer ( #8609 )
* Adding SmolLM Pre Tokenizer
* Update convert_hf_to_gguf_update.py
Co-authored-by: compilade <git@compilade.net>
* Update src/llama.cpp
Co-authored-by: compilade <git@compilade.net>
* handle regex
* removed .inp and .out ggufs
---------
Co-authored-by: compilade <git@compilade.net>
2024-07-22 17:43:01 +03:00
Jiří Podivín
566daa5a5b
*.py: Stylistic adjustments for python ( #8233 )
* Superfluous parens in conditionals were removed.
* Unused args in function were removed.
* Replaced unused `idx` var with `_`
* Initializing file_format and format_version attributes
* Renaming constant to capitals
* Preventing redefinition of the `f` var
Signed-off-by: Jiri Podivin <jpodivin@redhat.com>
2024-07-22 23:44:53 +10:00
Michael Coppola
940362224d
llama : add support for Tekken pre-tokenizer ( #8579 )
* llama : Added support for Tekken pre-tokenizer (#8577 )
Removed unneeded `vocab.tokenizer_clean_spaces` assignment
* llama : fix order of pre-tokenizers
* Tekken pre-tokenizer no longer uses clean_up_tokenization_spaces
* Updated chkhsh for Tekken tokenizer
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-07-20 16:43:51 +03:00
Georgi Gerganov
e235b267a2
py : switch to snake_case ( #8305 )
* py : switch to snake_case
ggml-ci
* cont
ggml-ci
* cont
ggml-ci
* cont : fix link
* gguf-py : use snake_case in scripts entrypoint export
* py : rename requirements for convert_legacy_llama.py
Needed for scripts/check-requirements.sh
---------
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-07-05 07:53:33 +03:00
ditsuke
01a5f06550
chore: Remove rebase artifacts
2024-07-04 15:39:13 +00:00
ditsuke
b0a46993df
build(python): Package scripts with pip-0517 compliance
2024-07-04 15:39:13 +00:00