Olivier Chafik
92c384a5e8
nits
2024-10-29 17:24:59 +00:00
Olivier Chafik
773ff91b7a
tool-call: force printing of lazy grammar trigger tokens to regularize function call parsing
2024-10-29 15:26:51 +00:00
Olivier Chafik
fa4c1119c9
tool-call: use functionary-small-v3.2-Q8_0.gguf in test (Q4_K_M too dumb for function call)
2024-10-29 15:25:37 +00:00
Olivier Chafik
64287a328d
tool-call: test Hermes-3-Llama-3.1-8B
2024-10-29 14:52:25 +00:00
ochafik
aefac1e5cb
tool-call: update scripts/fetch_server_test_models.py
2024-10-28 23:57:23 +00:00
ochafik
b825440c81
tool-call: use Q4_K_M models
2024-10-28 23:56:40 +00:00
ochafik
74d71a673e
agent: simplify syntax (default tools to local w/ default port)
2024-10-28 23:54:01 +00:00
ochafik
b51c71c734
tool-call: remove duplicate script to fetch templates
2024-10-28 21:35:18 +00:00
ochafik
ec547e4137
tool-call: add tests: tool_call=none, parallel_tool_calls=true
2024-10-28 10:04:00 +00:00
ochafik
168add7ec8
Update tool_call.feature
2024-10-28 02:06:00 +00:00
ochafik
dd6d0241a7
tool-call: script to prefetch models used in server tests
2024-10-28 02:01:00 +00:00
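A minimal sketch of what a prefetch step like the one above could look like, assuming the test models are pulled from Hugging Face via `huggingface_hub`; the repo IDs and filenames below are illustrative placeholders, not the actual list from scripts/fetch_server_test_models.py:

```python
# Illustrative sketch only: prefetch GGUF models so server tests don't download at runtime.
# Repo IDs and filenames are placeholders; the real list lives in scripts/fetch_server_test_models.py.
from huggingface_hub import hf_hub_download

TEST_MODELS = [
    ("NousResearch/Hermes-3-Llama-3.1-8B-GGUF", "Hermes-3-Llama-3.1-8B.Q4_K_M.gguf"),
    ("meetkai/functionary-small-v3.2-GGUF", "functionary-small-v3.2.Q8_0.gguf"),
]

for repo_id, filename in TEST_MODELS:
    # hf_hub_download stores the file in the local Hugging Face cache and returns its path.
    path = hf_hub_download(repo_id=repo_id, filename=filename)
    print(f"cached {filename} -> {path}")
```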
ochafik
7fde6d0091
tool_call: test no tool call on a real model + rename scenarios
2024-10-28 02:00:09 +00:00
ochafik
c88095e3fc
space nits
2024-10-28 00:27:04 +00:00
ochafik
9a86ea79a2
tool-call: slow tool call integration tests
2024-10-28 00:26:40 +00:00
ochafik
ec9f3b101b
nits
2024-10-27 16:44:54 +00:00
ochafik
080982ebf3
tool-call: test MistralNemo in forced tools server tests (w/ parallel tool calls disabled)
2024-10-27 16:39:51 +00:00
Olivier Chafik
30bd00bcf7
agent: fix tools setup
2024-10-25 02:00:47 +01:00
Olivier Chafik
5c414a3335
agent: simplify tools setup
2024-10-25 01:03:45 +01:00
Olivier Chafik
0f4fc8cb28
agent: fix no-cache issue in squid for brave tool
2024-10-24 18:59:37 +01:00
Olivier Chafik
03b86416e1
agent: fix deps + make docker compose setup easier to debug
2024-10-24 12:30:27 +01:00
ochafik
c2926e4bd9
Update README.md
2024-10-24 06:40:16 +01:00
ochafik
d338bfb87f
agent: ditch aiohttp & define REQUESTS_CA_BUNDLE to fix http proxying / trust the self-signed cert from python
2024-10-24 06:35:37 +01:00
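For context, a hedged sketch of the REQUESTS_CA_BUNDLE approach referenced above: Python's `requests` honours that environment variable as its CA bundle, so pointing it at the proxy's self-signed CA lets proxied HTTPS calls verify. The cert path and proxy address below are made-up placeholders, not the repo's actual setup:

```python
# Sketch, not the repo's actual agent code: make `requests` trust a self-signed proxy CA.
# The cert path and proxy address are illustrative placeholders.
import os
import requests

os.environ["REQUESTS_CA_BUNDLE"] = "/etc/squid/ssl/squid-ca.pem"  # self-signed CA of the proxy
os.environ["HTTPS_PROXY"] = "http://outgoing-proxy:3128"          # route traffic through squid

# requests reads both variables from the environment (trust_env=True by default),
# so this call verifies the proxy's certificate instead of failing the TLS handshake.
resp = requests.get("https://example.com/", timeout=10)
print(resp.status_code)
```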
ochafik
0f5d63943f
agent: display http errors nicely
2024-10-24 05:40:58 +01:00
ochafik
f5320af02a
tool-call: return tool_call.id (required by Nemo)
2024-10-24 05:40:15 +01:00
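For reference, the OpenAI-style shape this refers to: each entry in the assistant message's tool_calls array carries an id, which the follow-up tool message echoes back as tool_call_id (what Nemo's template expects, per the commit). A hedged Python illustration with made-up values:

```python
# Illustration of the OpenAI-compatible tool call shape; all values are made up.
tool_call = {
    "id": "call_abc123",          # the id the Nemo template expects to be present
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": '{"location": "Paris"}',
    },
}

# The follow-up tool result message references that id:
tool_result_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": '{"temperature_c": 18}',
}
```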
ochafik
267e630c14
agent: isolate tools container + log its outgoing HTTP & HTTPS traffic w/ docker compose + self-signed squid proxy
2024-10-24 05:38:54 +01:00
ochafik
414f6f1b30
Merge branch 'tool-call' of github.com:ochafik/llama.cpp into tool-call
2024-10-23 21:22:08 +01:00
ochafik
4394e1cd5e
Update tool-call.cpp
2024-10-23 21:21:39 +01:00
Olivier Chafik
5f4aef10ba
Merge remote-tracking branch 'origin/master' into tool-call
2024-10-23 11:28:28 +01:00
ochafik
2b49440011
tool-call: fix previous commit's parallel arg
2024-10-23 02:35:21 +01:00
ochafik
3e12b9b38e
tool-calls: basic Nemo support, default parallel to true if template mentions tool_call_id
2024-10-23 02:30:31 +01:00
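The heuristic in that commit amounts to a substring check on the chat template source; a hedged Python rendering (the actual code is C++, and the function name here is invented for illustration):

```python
# Invented name, illustrative only: the C++ code applies the same idea to the Jinja template source.
def default_parallel_tool_calls(chat_template_source: str) -> bool:
    # Templates that reference tool_call_id (e.g. Mistral Nemo's) expect tool results
    # to be matched back by id, so parallel tool calls default to on for them.
    return "tool_call_id" in chat_template_source
```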
github-actions[bot]
873279b159
flake.lock: Update
...
Flake lock file updates:
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/5633bcff0c6162b9e4b5f1264264611e950c8ec7?narHash=sha256-9UTxR8eukdg%2BXZeHgxW5hQA9fIKHsKCdOIUycTryeVw%3D' (2024-10-09)
→ 'github:NixOS/nixpkgs/4c2fcb090b1f3e5b47eaa7bd33913b574a11e0a0?narHash=sha256-/uilDXvCIEs3C9l73JTACm4quuHUsIHcns1c%2BcHUJwA%3D' (2024-10-18)
2024-10-23 01:28:07 +00:00
ochafik
fc80ad20ce
tool-call: Log tool call style name, ensure returned content not null
2024-10-22 23:41:47 +01:00
ochafik
a4f12a4594
minja: fix string subscripts, add string pipe to support Mistral-Nemo template
2024-10-22 23:39:46 +01:00
Xuan Son Nguyen
c8c07d658a
llama : fix empty batch causing llama_batch_allocr to crash ( #9966 )
...
* llama : fix empty batch causing llama_batch_allocr to crash
* move batch_allocr inside decode/encode_internal
* fix build
* add GGML_ASSERT
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-22 16:59:02 +02:00
Olivier Chafik
351aecbe3f
Update llama-sampling.cpp
2024-10-22 14:37:43 +01:00
Olivier Chafik
db4bf93812
Merge remote-tracking branch 'origin/master' into tool-call
2024-10-22 14:37:30 +01:00
Daniel Bevenius
19d900a756
llama : rename batch to ubatch ( #9950 )
...
This commit renames the member field batch in llm_build_context to
ubatch, and also the parameter batch in llama_build_graph, and
llama_set_inputs to ubatch.
The motivation for this change is to make the code more readable
(considering there are the structs llama_batch and llama_sbatch), and
consistent with other parts of the code base where parameters/fields of
type llama_ubatch are named ubatch.
2024-10-22 16:31:06 +03:00
Molly Sophia
11d47057a5
RWKV chat template fix ( #10001 )
...
* llama: remove useless template matching for rwkv-world
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
* converter: Add comment about the hack for rwkv models
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
* Update src/llama.cpp
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
---------
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-10-22 15:22:26 +02:00
Xuan Son Nguyen
c421ac072d
lora : warn user if new token is added in the adapter ( #9948 )
2024-10-22 13:08:41 +02:00
Olivier Chafik
7f2429e6b0
tool-calls: fix grammar regression
2024-10-22 11:49:50 +01:00
Molly Sophia
4ff7fe1fb3
llama : add chat template for RWKV-World + fix EOT ( #9968 )
...
* Add chat template for RWKV-World
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
* RWKV: Fix the chat template not being used
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
* RWKV v6: Set EOT token to ``\n\n``
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
* readme: add rwkv to the supported model list
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
---------
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-10-22 13:33:37 +03:00
ochafik
b53362a148
Update test-tool-call.cpp
2024-10-22 10:54:48 +01:00
ochafik
9f5ab97756
tool-calls: add generic tool call style as default
2024-10-22 10:53:21 +01:00
ochafik
fa8462ffd3
fix root
2024-10-22 10:53:01 +01:00
ochafik
75764871e6
tool-call: fix grammar roots
2024-10-22 10:50:52 +01:00
leo-pony
6b8447352d
[CANN] Adapt to dynamically loadable backends mechanism ( #9970 )
...
* [CANN] Adapt to dynamically loadable backends mechanism
* Fix bug: inference results were garbled when running in debug mode for LM models of the Q4_0 type
* Address the review comments on this pull request
2024-10-22 16:16:01 +08:00
Daniel Bevenius
674804a996
arg : fix typo in embeddings argument help [no ci] ( #9994 )
...
This commit fixes two typos in the help text for the `--embd-normalize`
and `--embd-separator` arguments. It also updates common.h, which contains
the same typo in two comments.
2024-10-22 10:40:02 +03:00
Georgi Gerganov
e94a138d64
llama.vim : fix info text display [no ci] ( #9787 )
2024-10-22 00:37:55 +03:00
Georgi Gerganov
e01c67affe
llama.vim : move info to the right of screen [no ci] ( #9787 )
...
'eol' messes up the rendering with nvim v0.10.2 for some reason
2024-10-21 22:53:18 +03:00
Asghar Ghorbani
994cfb1acb
readme : update UI list ( #9972 )
...
add PocketPal AI app
2024-10-21 21:20:59 +03:00