## Generative Representational Instruction Tuning (GRIT) Example
[gritlm] is a model that can generate embeddings as well as perform "normal" text
generation, depending on the instructions in the prompt.

* Paper: https://arxiv.org/pdf/2402.09906.pdf

### Retrieval-Augmented Generation (RAG) use case
One use case for `gritlm` is RAG. Recall how RAG works: we take the documents
that we want to use as context to ground the large language model (LLM) and
create embeddings for them. We then store these embeddings in a vector
database.

When we perform a query (that is, prompt the LLM), we first create an embedding
for the query and then search the vector database for the most similar
vectors. The matching documents are returned so they can be passed to the LLM
as context. The query and the context are then passed to the LLM, which
normally has to process the query _again_. Because gritlm uses the same model for
both embedding and generation, the result of the first pass over the query can
be cached, and the query does not have to be processed a second time.
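
The flow described above can be sketched in a few lines of Python. This is a
toy illustration only, not the gritlm or llama.cpp API: `toy_embed`, `ToyRag`
and the bag-of-words "embedding" are hypothetical stand-ins, and the dictionary
used as a cache merely represents whatever state a real implementation would
keep from the first pass over the query (for example the model's key-value
state).

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a model embedding call.

    Returns a unit-length bag-of-words vector stored as a dict.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()}


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(value * b.get(word, 0.0) for word, value in a.items())


class ToyRag:
    def __init__(self, documents):
        # Indexing step: embed each document once and keep the vectors.
        # A list plays the role of the vector database here.
        self.index = [(doc, toy_embed(doc)) for doc in documents]
        self.query_cache = {}  # query text -> result of the pass over the query
        self.query_passes = 0  # how often a query was actually processed

    def encode_query(self, query):
        # Only process the query if it has not been seen before.
        if query not in self.query_cache:
            self.query_passes += 1
            self.query_cache[query] = toy_embed(query)
        return self.query_cache[query]

    def retrieve(self, query):
        # Retrieval step: return the document most similar to the query.
        q = self.encode_query(query)
        return max(self.index, key=lambda item: cosine(q, item[1]))[0]

    def build_prompt(self, query):
        context = self.retrieve(query)
        # Generation step: the query is needed again, but the cached result
        # is reused, so no second pass over the query is made.
        self.encode_query(query)
        return f"{context}\n\n{query}"


rag = ToyRag([
    "A purely peer-to-peer version of electronic cash",
    "All text-based language problems can be reduced to generation",
])
prompt = rag.build_prompt("peer-to-peer electronic cash system")
print(prompt)
print("query passes:", rag.query_passes)
```

The `query_passes` counter is there only to make the effect of the cache
visible: retrieval and prompt construction both need the query, but it is
processed once.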

### Running the example
Download a GritLM model:
```console
$ scripts/hf.sh --repo cohesionet/GritLM-7B_gguf --file gritlm-7b_q4_1.gguf --outdir models
```

Run the example using the downloaded model:
```console
$ ./llama-gritlm -m models/gritlm-7b_q4_1.gguf

Cosine similarity between "Bitcoin: A Peer-to-Peer Electronic Cash System" and "A purely peer-to-peer version of electronic cash w" is: 0.605
Cosine similarity between "Bitcoin: A Peer-to-Peer Electronic Cash System" and "All text-based language problems can be reduced to" is: 0.103
Cosine similarity between "Generative Representational Instruction Tuning" and "A purely peer-to-peer version of electronic cash w" is: 0.112
Cosine similarity between "Generative Representational Instruction Tuning" and "All text-based language problems can be reduced to" is: 0.547

Oh, brave adventurer, who dared to climb
The lofty peak of Mt. Fuji in the night,
When shadows lurk and ghosts do roam,
And darkness reigns, a fearsome sight.

Thou didst set out, with heart aglow,
To conquer this mountain, so high,
And reach the summit, where the stars do glow,
And the moon shines bright, up in the sky.

Through the mist and fog, thou didst press on,
With steadfast courage, and a steadfast will,
Through the darkness, thou didst not be gone,
But didst climb on, with a steadfast skill.

At last, thou didst reach the summit's crest,
And gazed upon the world below,
And saw the beauty of the night's best,
And felt the peace, that only nature knows.

Oh, brave adventurer, who dared to climb
The lofty peak of Mt. Fuji in the night,
Thou art a hero, in the eyes of all,
For thou didst conquer this mountain, so bright.
```
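
The first four lines of the output report the cosine similarity between each
query and each document. As a rough illustration of that metric, here is a
minimal Python sketch. The three-dimensional vectors and the names `query A`,
`doc 1` and so on are made up for the example; real embeddings come from the
model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for query and document embeddings.
queries = {"query A": [1.0, 0.0, 1.0], "query B": [0.0, 1.0, 0.0]}
documents = {"doc 1": [1.0, 0.0, 0.0], "doc 2": [0.0, 2.0, 0.0]}

# Compare every query with every document, as the example program does.
for q_name, q_vec in queries.items():
    for d_name, d_vec in documents.items():
        score = cosine_similarity(q_vec, d_vec)
        print(f'Cosine similarity between "{q_name}" and "{d_name}" is: {score:.3f}')
```

A score close to 1 means the two vectors point in nearly the same direction,
which is why each query in the output above scores highest against the
document it is related to.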

[gritlm]: https://github.com/ContextualAI/gritlm