Commit Graph

2 Commits

Author SHA1 Message Date
Douglas Hanley
2891c8aa9a
Add support for BERT embedding models (#5423)
* BERT model graph construction (build_bert)
* WordPiece tokenizer (llm_tokenize_wpm)
* Add flag for non-causal attention models
* Allow for models that only output embeddings
* Support conversion of BERT models to GGUF
* Based on prior work by @xyzhang626 and @skeskinen

---------

Co-authored-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-11 11:21:38 -05:00
Jiří Podivín
5ddf7ea1fb
hooks : setting up flake8 and pre-commit hooks (#1681)
Small, non-functional changes were made to non-compliant files.
These include breaking up long lines, whitespace sanitation and
unused import removal.

Maximum line length in python files was set to a generous 125 chars,
in order to minimize number of changes needed in scripts and general
annoyance. The "txt" prompts directory is excluded from the checks
as it may contain oddly formatted files and strings for a good reason.

Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
2023-06-17 13:32:48 +03:00