llama.cpp/ggml-vocab-llama.gguf at 66a66a05a8c509e57e170659182f96165d2e8112

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-28 12:24:35 +00:00

llama : improve token type support (#2668)

* Merge tokenizer fixes into the gguf branch.

* Add test vocabularies

* Adapt convert-new.py (and fix a clang-cl compiler error on Windows)

* Improved tokenizer test

But does it work on macOS?

* Improve token type support

- Added @klosax's code to convert.py
- Improved token type support in vocabulary

* Exclude platform dependent tests

* More sentencepiece compatibility by eliminating magic numbers

* Restored accidentally removed comment

2023-08-21 18:56:02 +03:00

582 KiB
