llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-13 14:29:52 +00:00

Douglas Hanley 2891c8aa9a Add support for BERT embedding models (#5423)	2024-02-11 11:21:38 -05:00
* BERT model graph construction (build_bert)
* WordPiece tokenizer (llm_tokenize_wpm)
* Add flag for non-causal attention models
* Allow for models that only output embeddings
* Support conversion of BERT models to GGUF
* Based on prior work by @xyzhang626 and @skeskinen
---------
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

__init__.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
constants.py	Add support for BERT embedding models (#5423)	2024-02-11 11:21:38 -05:00
gguf_reader.py	gguf : fix "general.alignment" type in gguf_reader.py (#5136)	2024-01-26 11:10:28 +02:00
gguf_writer.py	Add support for BERT embedding models (#5423)	2024-02-11 11:21:38 -05:00
gguf.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
py.typed	convert : various script cleanups/fixes + merges and special token handling (#2842)	2023-08-30 11:25:50 +03:00
tensor_mapping.py	Add support for BERT embedding models (#5423)	2024-02-11 11:21:38 -05:00
vocab.py	py : open merges file as 'utf-8' (#4566)	2023-12-21 19:07:34 +02:00