llama.cpp/examples/embedding
Douglas Hanley 2891c8aa9a
Add support for BERT embedding models (#5423)
* BERT model graph construction (build_bert)
* WordPiece tokenizer (llm_tokenize_wpm)
* Add flag for non-causal attention models
* Allow for models that only output embeddings
* Support conversion of BERT models to GGUF
* Based on prior work by @xyzhang626 and @skeskinen

---------

Co-authored-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-02-11 11:21:38 -05:00
..
CMakeLists.txt build : link against build info instead of compiling against it (#3879) 2023-11-02 08:50:16 +02:00
embedding.cpp Add support for BERT embedding models (#5423) 2024-02-11 11:21:38 -05:00
README.md embedding : update README.md (#3224) 2023-09-21 11:57:40 +03:00

llama.cpp/example/embedding

This example demonstrates generate high-dimensional embedding vector of a given text with llama.cpp.

Quick Start

To get started right away, run the following command, making sure to use the correct path for the model you have:

Unix-based systems (Linux, macOS, etc.):

./embedding -m ./path/to/model --log-disable -p "Hello World!" 2>/dev/null

Windows:

embedding.exe -m ./path/to/model --log-disable -p "Hello World!" 2>$null

The above command will output space-separated float values.