Olivier Chafik
|
366efc8a18
|
tool-call : fix llama 3.x tc parsing when there are spaces before "name"
|
2024-10-03 21:46:41 +01:00 |
|
Olivier Chafik
|
ece12b074f
|
antiprompts : ensure partial match is at end of string (or else server stops sending replies)
|
2024-10-03 19:23:08 +01:00 |
|
Olivier Chafik
|
9e502e89a5
|
tool-call : promote getting chat templates w/ dedicated script rather than rely on test resources
|
2024-10-02 15:03:08 +01:00 |
|
Olivier Chafik
|
c76b14501e
|
tool-call : fix Makefile
|
2024-10-02 00:06:42 +01:00 |
|
Olivier Chafik
|
c36a196f53
|
tool-call : prepare possible externalization of minja + factor tool call style out of template
|
2024-10-01 23:12:24 +01:00 |
|
ochafik
|
d9451fd647
|
antiprompts : avoid c++20 struct initializers in test
|
2024-09-30 04:08:55 +01:00 |
|
ochafik
|
0fc5ad7ae1
|
minja : avoid c++20 struct initializers in test
|
2024-09-30 03:51:48 +01:00 |
|
ochafik
|
9ac4b04aa2
|
tool-call : add fs_list_files to common, w/ win32 impl for msys2 build
|
2024-09-29 00:42:52 +01:00 |
|
ochafik
|
cb7912ee74
|
chat-template : add phi-3.5-vision-instruct
|
2024-09-29 00:33:19 +01:00 |
|
ochafik
|
c87c12168a
|
tool-call : fix memory leak in test
|
2024-09-28 23:44:28 +01:00 |
|
ochafik
|
22493c8e9e
|
tests : fix test-chat-template run from build
|
2024-09-28 23:31:23 +01:00 |
|
ochafik
|
ad6719e2a7
|
tests : fix typo
|
2024-09-28 23:26:19 +01:00 |
|
ochafik
|
a072f30a8d
|
tests : attempt to find assets for tests run from build subfolder
|
2024-09-28 23:15:36 +01:00 |
|
ochafik
|
bc3e0c0830
|
tool-call : Qwen 2.5 Instruct also requires object arguments
|
2024-09-28 23:05:35 +01:00 |
|
ochafik
|
dbda025f87
|
tool-call : test messages -> template -> grammar -> tool call parser
|
2024-09-28 22:32:47 +01:00 |
|
ochafik
|
9358d1f62c
|
minja : fix gcc8 build of test
|
2024-09-28 19:50:08 +01:00 |
|
ochafik
|
6e0053a81b
|
chat-template : enumerate files w/ C API rather than private using std::__fs::filesystem
|
2024-09-28 18:47:11 +01:00 |
|
ochafik
|
c657857e21
|
tool-call : cleanup tools.py
|
2024-09-28 18:33:40 +01:00 |
|
ochafik
|
d983516f40
|
tool-call : let the tool call handler expand chat template, moving builtin_tools down as extra_context
|
2024-09-28 17:46:36 +01:00 |
|
ochafik
|
0c85bc7a8f
|
tool-call : test tool call style detection
|
2024-09-28 17:43:09 +01:00 |
|
ochafik
|
887951beb0
|
minja : generate chat goldens w/ fixed date to support Llama-3.2-3B-Instruct (uses strftime_now)
|
2024-09-27 19:52:15 +01:00 |
|
ochafik
|
701b664551
|
minja : add indent filter to support command-r-plus's chat templates
|
2024-09-27 19:00:14 +01:00 |
|
ochafik
|
0093a5e527
|
minja : fix identifiers parsing (when start w/ not/is/etc) and lstrip_blocks corner case (needed by DeepSeek-V2.5)
|
2024-09-27 18:30:44 +01:00 |
|
ochafik
|
1e5c0e747e
|
chat-template : fix jinja tests (make safe a passthrough)
|
2024-09-27 03:50:04 +01:00 |
|
ochafik
|
f9c1743bb5
|
minja : fix iterables
|
2024-09-27 03:36:49 +01:00 |
|
ochafik
|
10f9fe8d49
|
tool-call : fix tool call return format
|
2024-09-26 21:01:04 +01:00 |
|
ochafik
|
2926089c5d
|
fix lints
|
2024-09-26 19:06:29 +01:00 |
|
ochafik
|
5840e10069
|
tool-call : merge & fix jinja template tests into test-chat-template
|
2024-09-26 19:05:00 +01:00 |
|
ochafik
|
50685f837f
|
minja : add str.title()
|
2024-09-26 19:03:59 +01:00 |
|
ochafik
|
296331bba3
|
minja : update chat template goldens w/ llama.3.1 arguments workaround
|
2024-09-26 18:10:27 +01:00 |
|
ochafik
|
cf7bece6a7
|
tool-call : factor chat template away from legacy API
|
2024-09-26 17:19:29 +01:00 |
|
ochafik
|
0c870133d8
|
tool-call : test/fix functionary-medium-v3.1's template (can "look" like llama3.1 template)
|
2024-09-26 05:56:15 +01:00 |
|
ochafik
|
8e4a9bad8a
|
minja : allow none input to selectattr, and add safe passthrough filter
|
2024-09-26 05:53:12 +01:00 |
|
ochafik
|
2eb29bf8b8
|
tool-call : update chat templates/goldens
|
2024-09-26 04:00:10 +01:00 |
|
ochafik
|
4cd82d61dd
|
tool-call : fix pyright type errors
|
2024-09-26 03:59:38 +01:00 |
|
ochafik
|
595e11cb11
|
tool-call : fix/test functionary v3
|
2024-09-26 03:42:05 +01:00 |
|
ochafik
|
c124ab48ea
|
minja : add str.endswith
|
2024-09-26 03:21:23 +01:00 |
|
ochafik
|
76d2938ef8
|
fix flake8 lints
|
2024-09-26 02:30:17 +01:00 |
|
ochafik
|
1b6280102b
|
fix editorconfig lints
|
2024-09-26 02:27:46 +01:00 |
|
ochafik
|
e983c9d0de
|
tool-call : fix llama_chat_apply_template signature / test-chat-template
|
2024-09-25 22:02:58 +01:00 |
|
ochafik
|
97d0620968
|
minja : fetch more templates (add models from test-chat-template)
|
2024-09-25 19:22:43 +01:00 |
|
ochafik
|
4706bdbae1
|
tool-call : support Functionary v3 vs. v3-llama3.1 variants
|
2024-09-25 17:33:00 +01:00 |
|
ochafik
|
e309c6a47f
|
tool-call : integrate minja & tool-call to server when --jinja is set
|
2024-09-25 16:14:46 +01:00 |
|
ochafik
|
3cfc21ea71
|
tool-call : basic Functionary 3.2, Llama 3.1, Hermes 2 Pro grammar generators + parsers
|
2024-09-25 16:14:22 +01:00 |
|
ochafik
|
eaca756ecc
|
minja : minimalist Jinja templating engine for LLM chat templates
|
2024-09-25 16:14:22 +01:00 |
|
ochafik
|
5b6d5040d5
|
grammar : trigger words + refactor of antiprompts
|
2024-09-25 16:14:22 +01:00 |
|
Georgi Gerganov
|
b0f27361f3
|
sampling : avoid expensive softmax during greedy sampling (#9605)
* sampling : avoid expensive softmax during greedy sampling
ggml-ci
* speculative : fix default RNG seed + set sparams.n_probs
* Update tests/test-sampling.cpp
Co-authored-by: slaren <slarengh@gmail.com>
* sampling : add clarifying comment [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>
|
2024-09-24 09:03:17 +03:00 |
|
Johannes Gäßler
|
a5b57b08ce
|
CUDA: enable Gemma FA for HIP/Pascal (#9581)
|
2024-09-22 09:34:52 +02:00 |
|
Molly Sophia
|
2a63caaa69
|
RWKV v6: RWKV_WKV op CUDA implementation (#9454)
* ggml: CUDA unary op EXP
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
* ggml: rwkv_wkv op CUDA impl
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
---------
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
|
2024-09-22 04:29:12 +02:00 |
|
Johannes Gäßler
|
424c5d00a9
|
ggml/examples: add backend support for numerical optimization (ggml/949)
* CUDA eval works
* stochastic gradient descent op
* Adam except decay
* CUDA CROSS_ENTROPY_LOSS_BACK
* CUDA mnist-fc training works
* backend CLI arg
* refactor gguf load
* remove sched from opt_step_adam
* implement l1 regularization (weight decay)
* extra call to add optimizer
* initialize gradients with ggml_graph_reset
* gradient accumulation
* increment iter per eval instead of epoch
* adjust backend interfaces
* fix ggml_graph_reset without backend
* fix ggml graph export/import
* fixup
* rename
* revert ggml_opt changes
* more general CUDA repeat_back
* update documentation, fix CNN
* validation split
* add clarifying comment
* optimize PyTorch training
* adjust buffer size, thread count
* fix 0.0f validation split
* Update examples/mnist/mnist-common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* fix gradient accumulation
* tensor flag for accumulators -> tensor hash set
* Update include/ggml.h
Co-authored-by: slaren <slarengh@gmail.com>
* Update tests/test-backend-ops.cpp
Co-authored-by: slaren <slarengh@gmail.com>
* Update tests/test-backend-ops.cpp
Co-authored-by: slaren <slarengh@gmail.com>
* fix test prints
* Update src/ggml-backend.c
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* better CUDA support for noncontiguous out_prod
* add comment
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>
|
2024-09-20 21:15:05 +03:00 |
|