mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-12 19:50:17 +00:00


llama.cpp/example/embedding

This example demonstrates how to generate a high-dimensional embedding vector for a given text with llama.cpp.

Quick Start

To get started right away, run the following command, making sure to use the correct path for the model you have:

Unix-based systems (Linux, macOS, etc.):

./llama-embedding -m ./path/to/model --pooling mean --log-disable -p "Hello World!" 2>/dev/null

Windows:

llama-embedding.exe -m ./path/to/model --pooling mean --log-disable -p "Hello World!" 2>$null

The above command will output space-separated float values.
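
If the output needs to be consumed by another program, the values can be read back into a list of floats. The following is a minimal sketch, assuming the captured stdout carries the values as whitespace-separated numbers as described above; `parse_embedding` is a hypothetical helper name, not part of llama.cpp, and it skips any token that does not parse as a number as a precaution.

```python
def parse_embedding(stdout_text):
    """Turn whitespace-separated float values into a list of floats.

    Hypothetical helper for illustration; tokens that are not valid
    numbers (labels, stray text) are skipped.
    """
    values = []
    for token in stdout_text.split():
        try:
            values.append(float(token))
        except ValueError:
            continue  # not a number, ignore it
    return values


if __name__ == "__main__":
    # Made-up values standing in for real program output.
    print(parse_embedding("0.125 -0.5 0.25\n"))
```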

extra parameters

--embd-normalize `integer`

`integer`	description	formula
`-1`	none
`0`	max absolute int16	`\Large{{32760 * x_i} \over\max \lvert x_i\rvert}`
`1`	taxicab	`\Large{x_i \over\sum \lvert x_i\rvert}`
`2`	euclidean (default)	`\Large{x_i \over\sqrt{\sum x_i^2}}`
`>2`	p-norm	`\Large{x_i \over\sqrt[p]{\sum \lvert x_i\rvert^p}}`
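
The formulas in the table can be sketched in a few lines of Python. This is an illustrative re-implementation of the table only, not llama.cpp's own code; the function name `embd_normalize` and the choice to return zeros for an all-zero vector are assumptions made for this sketch.

```python
import math


def embd_normalize(vec, mode=2):
    """Normalize `vec` according to the --embd-normalize table.

    Illustrative sketch only (assumed name and zero-vector handling).
    """
    if mode < 0:
        # -1: no normalization
        return list(vec)
    if mode == 0:
        # max absolute value, scaled so the largest entry maps to 32760
        denom = max(abs(x) for x in vec) / 32760.0
    elif mode == 1:
        # taxicab: sum of absolute values
        denom = sum(abs(x) for x in vec)
    elif mode == 2:
        # euclidean (default): square root of the sum of squares
        denom = math.sqrt(sum(x * x for x in vec))
    else:
        # p-norm with p = mode
        denom = sum(abs(x) ** mode for x in vec) ** (1.0 / mode)
    if denom == 0:
        # assumption of this sketch: an all-zero vector stays all-zero
        return [0.0] * len(vec)
    return [x / denom for x in vec]


if __name__ == "__main__":
    print(embd_normalize([3.0, 4.0], 2))   # euclidean
    print(embd_normalize([1.0, -3.0], 1))  # taxicab
```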

--embd-output-format `'string'`

`'string'`	description
''	same as before	(default)
'array'	single embeddings	`[[x_1,...,x_n]]`
	multiple embeddings	`[[x_1,...,x_n],[x_1,...,x_n],...,[x_1,...,x_n]]`
'json'	openai style
'json+'	add cosine similarity matrix
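
To illustrate what a cosine similarity matrix over several embeddings looks like, here is a minimal sketch that reads text in the `'array'` shape shown above and computes the pairwise similarities. It assumes that the `'array'` output is valid JSON, as the notation in the table suggests; `cosine_matrix` is a hypothetical helper, and the exact layout of the `'json+'` output is not reproduced here.

```python
import json
import math


def cosine_matrix(array_output):
    """Pairwise cosine similarity for embeddings given as '[[...],[...]]'.

    Hypothetical helper for illustration; assumes the text parses as JSON.
    """
    embs = json.loads(array_output)
    norms = [math.sqrt(sum(x * x for x in e)) for e in embs]
    n = len(embs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embs[i], embs[j]))
            denom = norms[i] * norms[j]
            # a zero vector has no direction; report 0.0 in that case
            matrix[i][j] = dot / denom if denom > 0 else 0.0
    return matrix


if __name__ == "__main__":
    # Made-up two-dimensional vectors standing in for real embeddings.
    for row in cosine_matrix("[[1,0],[0,1],[1,1]]"):
        print(row)
```

Each entry `matrix[i][j]` compares embedding `i` with embedding `j`, so the matrix is symmetric and its diagonal is 1 for non-zero vectors.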

--embd-separator `"string"`

`"string"`
"\n"	(default)
"<#embSep#>"	for example
"<#sep#>"	another example

examples

Unix-based systems (Linux, macOS, etc.):

./llama-embedding -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --pooling mean --embd-separator '<#sep#>' --embd-normalize 2  --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>/dev/null

Windows:

llama-embedding.exe -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --pooling mean --embd-separator '<#sep#>' --embd-normalize 2  --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>$null
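
In the commands above, a single `-p` argument carries four texts joined by the separator passed to `--embd-separator`. The sketch below shows the intended effect of that option as a plain string split; it is an approximation for illustration, and `split_prompt` is a hypothetical helper rather than a llama.cpp function.

```python
def split_prompt(prompt, separator="\n"):
    """Split one prompt string into the individual texts to embed.

    Hypothetical helper; the default separator mirrors the documented
    default of --embd-separator.
    """
    return prompt.split(separator)


if __name__ == "__main__":
    texts = split_prompt("Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat", "<#sep#>")
    print(texts)  # ['Castle', 'Stronghold', 'Dog', 'Cat']
```

Each resulting text gets its own embedding, which is why the `'array'` output format lists one inner array per text.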
