llama.cpp/examples/embedding
Xuan Son Nguyen 1b9ae5189c
common : refactor arg parser (#9308)
* (wip) argparser v3

* migrated

* add test

* handle env

* fix linux build

* add export-docs example

* fix build (2)

* skip build test-arg-parser on windows

* update server docs

* bring back missing --alias

* bring back --n-predict

* clarify test-arg-parser

* small correction

* add comments

* fix args with 2 values

* refine example-specific args

* no more lamba capture

Co-authored-by: slaren@users.noreply.github.com

* params.sparams

* optimize more

* export-docs --> gen-docs
2024-09-07 20:43:51 +02:00
..
CMakeLists.txt build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) 2024-06-13 00:41:52 +01:00
embedding.cpp common : refactor arg parser (#9308) 2024-09-07 20:43:51 +02:00
README.md embedding : add --pooling option to README.md [no ci] (#8934) 2024-08-09 09:33:30 +03:00

llama.cpp/example/embedding

This example demonstrates generate high-dimensional embedding vector of a given text with llama.cpp.

Quick Start

To get started right away, run the following command, making sure to use the correct path for the model you have:

Unix-based systems (Linux, macOS, etc.):

./llama-embedding -m ./path/to/model --pooling mean --log-disable -p "Hello World!" 2>/dev/null

Windows:

llama-embedding.exe -m ./path/to/model --pooling mean --log-disable -p "Hello World!" 2>$null

The above command will output space-separated float values.

extra parameters

--embd-normalize integer

integer description formula
-1 none
0 max absolute int16 \Large{{32760 * x_i} \over\max \lvert x_i\rvert}
1 taxicab \Large{x_i \over\sum \lvert x_i\rvert}
2 euclidean (default) \Large{x_i \over\sqrt{\sum x_i^2}}
>2 p-norm \Large{x_i \over\sqrt[p]{\sum \lvert x_i\rvert^p}}

--embd-output-format 'string'

'string' description
'' same as before (default)
'array' single embeddings [[x_1,...,x_n]]
multiple embeddings [[x_1,...,x_n],[x_1,...,x_n],...,[x_1,...,x_n]]
'json' openai style
'json+' add cosine similarity matrix

--embd-separator "string"

"string"
"\n" (default)
"<#embSep#>" for exemple
"<#sep#>" other exemple

examples

Unix-based systems (Linux, macOS, etc.):

./llama-embedding -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --pooling mean --embd-separator '<#sep#>' --embd-normalize 2  --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>/dev/null

Windows:

llama-embedding.exe -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --pooling mean --embd-separator '<#sep#>' --embd-normalize 2  --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>/dev/null