Olivier Chafik
2428b73853
agent: ditch openai dependency, use cache_prompt and expose seed
2024-10-02 16:26:45 +01:00
Olivier Chafik
b559d64ecc
Update README.md
2024-10-02 15:19:27 +01:00
Olivier Chafik
9e502e89a5
tool-call: promote getting chat templates w/ dedicated script rather than rely on test resources
2024-10-02 15:03:08 +01:00
Olivier Chafik
f3538e755b
update tools
2024-10-02 14:57:25 +01:00
Olivier Chafik
5b01402655
agent: add brave_search & fetch_page tools + move to examples/agent/tools/
2024-10-02 14:29:45 +01:00
Olivier Chafik
c36a196f53
tool-call: prepare possible externalization of minja + factor tool call style out of template
2024-10-01 23:12:24 +01:00
ochafik
dbda025f87
tool-call: test messages -> template -> grammar -> tool call parser
2024-09-28 22:32:47 +01:00
ochafik
0ae1112faa
agent: try to fix pyright lint
2024-09-28 20:10:08 +01:00
ochafik
ef2a020276
tool-call: make agent async
2024-09-28 19:11:09 +01:00
ochafik
c657857e21
tool-call: cleanup tools.py
2024-09-28 18:33:40 +01:00
ochafik
55cf337560
tool-call: better error reporting for server tests
2024-09-28 18:33:40 +01:00
ochafik
7cef90cf9c
tool-call: more eager function call parsing for Functionary & Llama (give a chance to 3B model)
2024-09-28 18:33:40 +01:00
ochafik
8b2cf3509f
tool-call: fix grammar trigger crash
2024-09-28 18:30:01 +01:00
ochafik
d983516f40
tool-call: let the tool call handler expand chat template, moving builtin_tools down as extra_context
2024-09-28 17:46:36 +01:00
ochafik
2f25ee30ef
Update README.md
2024-09-27 07:18:07 +01:00
ochafik
86e4f99092
Update README.md
2024-09-27 07:15:25 +01:00
ochafik
e62b5de3cf
tool-call: fix functionary-small-3.2 (first tool starts w/ name\n, subsequent are >>>name\n)
2024-09-27 07:06:33 +01:00
ochafik
e33b342da7
tool-call: fix passing of tools to template + allow agent to finish
2024-09-27 06:24:22 +01:00
ochafik
f62e688387
tool-call: fix crash / test non-tool call case (added llama_sampler_is_grammar_empty)
2024-09-27 06:04:41 +01:00
ochafik
0abfa36ca7
tool-call: move usage examples to examples/agent
2024-09-27 05:10:30 +01:00
ochafik
6610ecf965
server: rm bad debug code
2024-09-27 04:07:35 +01:00
ochafik
9295ca95db
tool-call: fix agent type lints
2024-09-27 03:53:56 +01:00
ochafik
8299fac07c
tool-call: adapt very simple agent + docker isolation from https://github.com/ggerganov/llama.cpp/pull/6389
2024-09-26 21:07:46 +01:00
ochafik
10f9fe8d49
tool-call: fix tool call return format
2024-09-26 21:01:04 +01:00
ochafik
c88c932d98
fix gcc error + lint
2024-09-26 19:18:40 +01:00
ochafik
9cfe4d7202
tool-call: refactor llama_chat_template class + use in validate_model_chat_template
2024-09-26 18:06:03 +01:00
ochafik
cf7bece6a7
tool-call: factor chat template away from legacy API
2024-09-26 17:19:29 +01:00
ochafik
3d2650ce65
fix gcc build
2024-09-26 06:52:34 +01:00
ochafik
0c870133d8
tool-call: test/fix functionary-medium-v3.1's template (can "look" like llama3.1 template)
2024-09-26 05:56:15 +01:00
ochafik
4cd82d61dd
tool-call: fix pyright type errors
2024-09-26 03:59:38 +01:00
ochafik
94377d743c
server: catch errors in format_final_response_oaicompat instead of taking server down
2024-09-26 03:42:36 +01:00
ochafik
595e11cb11
tool-call: fix/test functionary v3
2024-09-26 03:42:05 +01:00
ochafik
1b6280102b
fix editorconfig lints
2024-09-26 02:27:46 +01:00
ochafik
ab25e3fbf9
tool-call: allow empty message content when there's tool_calls in format_chat
2024-09-26 02:19:04 +01:00
ochafik
d928ff4dfd
server: catch errors in oaicompat_completion_params_parse instead of taking server down
2024-09-26 02:18:01 +01:00
ochafik
a774093a99
tool-call: add server tests for llama 3.1
2024-09-26 02:17:30 +01:00
ochafik
9e366b3d03
server: fix trailing comma in completions_seed
2024-09-26 02:15:48 +01:00
ochafik
45b243b4a5
minja: fix llama_chat_apply_template + add use_jinja param to validate_model_chat_template
2024-09-26 02:14:42 +01:00
ochafik
e983c9d0de
tool-call: fix llama_chat_apply_template signature / test-chat-template
2024-09-25 22:02:58 +01:00
ochafik
d15dcfb09d
tool-call: add output example to readme
2024-09-25 19:22:16 +01:00
ochafik
33ea20edd1
Merge remote-tracking branch 'origin/master' into tool-call
2024-09-25 18:58:54 +01:00
ochafik
8f25531c44
tool-call: add basic usage example to server readme
2024-09-25 18:00:31 +01:00
ochafik
e309c6a47f
tool-call: integrate minja & tool-call to server when --jinja is set
2024-09-25 16:14:46 +01:00
ochafik
5b6d5040d5
grammar: trigger words + refactor of antiprompts
2024-09-25 16:14:22 +01:00
Xuan Son Nguyen
afbbfaa537
server : add more env vars, improve gen-docs (#9635)
* server : add more env vars, improve gen-docs
* update server docs
* LLAMA_ARG_NO_CONTEXT_SHIFT
2024-09-25 14:05:13 +02:00
Georgi Gerganov
cea1486ecf
log : add CONT level for continuing previous log entry (#9610)
2024-09-24 10:15:35 +03:00
StrangeBytesDev
0aa15011e3
server : add newline after chat example (#9616)
2024-09-24 09:04:39 +03:00
Georgi Gerganov
b0f27361f3
sampling : avoid expensive softmax during greedy sampling (#9605)
* sampling : avoid expensive softmax during greedy sampling
ggml-ci
* speculative : fix default RNG seed + set sparams.n_probs
* Update tests/test-sampling.cpp
Co-authored-by: slaren <slarengh@gmail.com>
* sampling : add clarifying comment [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-09-24 09:03:17 +03:00
Xuan Son Nguyen
0b3bf966f4
server : add --no-context-shift option (#9607)
* server : add --no-context-shift option
* small fix
* Update examples/server/tests/features/embeddings.feature
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* tests : minor fix
* revert usage of GGML_ASSERT
* update server documentation
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-09-23 22:23:54 +02:00
Georgi Gerganov
37f8c7b4c9
perplexity : remove extra new lines after chunks (#9596)
2024-09-23 11:28:02 +03:00