Commit Graph

183 Commits

Author SHA1 Message Date
Djip007
2cd43f4900
ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
* more perfo with llamafile tinyblas on x86_64.

- add bf16 support
- change dispatch strategy (thanks:
https://github.com/ikawrakow/ik_llama.cpp/pull/71 )
- reduce memory bandwidth

simple tinyblas dispatch and more cache friendly

* tinyblas dynamic dispatching

* sgemm: add M blocks.

* - git 2.47 use short id of len 9.
- show-progress is not part of GNU Wget2

* remove not stable test
2024-12-24 18:54:49 +01:00
Georgi Gerganov
5437d4aaf5
sync : ggml 2024-12-17 18:36:02 +02:00
Georgi Gerganov
87cf323cef
scripts : change build path to "build-bench" for compare-commits.sh (#10836) 2024-12-15 18:44:47 +02:00
Georgi Gerganov
0cd182ebcc
sync : ggml 2024-12-05 13:27:42 +02:00
Diego Devesa
59f4db1088
ggml : add predefined list of CPU backend variants to build (#10626)
* ggml : add predefined list of CPU backend variants to build

* update CPU dockerfiles
2024-12-04 14:45:40 +01:00
Georgi Gerganov
1cd3df46bd scripts : remove amx sync
ggml-ci
2024-12-03 20:04:49 +02:00
Georgi Gerganov
c505471857 sync : ggml 2024-12-03 20:04:49 +02:00
Georgi Gerganov
8648c52101
make : deprecate (#10514)
* make : deprecate

ggml-ci

* ci : disable Makefile builds

ggml-ci

* docs : remove make references [no ci]

* ci : disable swift build

ggml-ci

* docs : remove obsolete make references, scripts, examples

ggml-ci

* basic fix for compare-commits.sh

* update build.md

* more build.md updates

* more build.md updates

* more build.md updates

* Update Makefile

Co-authored-by: Diego Devesa <slarengh@gmail.com>

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-12-02 21:22:53 +02:00
Diego Devesa
3420909dff
ggml : automatic selection of best CPU backend (#10606)
* ggml : automatic selection of best CPU backend

* amx : minor opt

* add GGML_AVX_VNNI to enable avx-vnni, fix checks
2024-12-01 16:12:41 +01:00
Georgi Gerganov
fee824a1a1
sync : ggml 2024-11-27 11:10:42 +02:00
Georgi Gerganov
87a533be57
sync : ggml 2024-11-21 09:22:11 +02:00
Georgi Gerganov
9fe0fb0626 sync : ggml 2024-11-19 20:03:21 +02:00
Georgi Gerganov
5c9a8b22b1 scripts : update sync 2024-11-17 08:30:29 +02:00
Johannes Gäßler
4e54be0ec6
llama/ex: remove --logdir argument (#10339) 2024-11-16 23:00:41 +01:00
Georgi Gerganov
f245cc28d4
scripts : fix missing key in compare-llama-bench.py (#10332) 2024-11-16 10:32:50 +02:00
Johannes Gäßler
4047be74da
scripts: update compare-llama-bench.py (#10319) 2024-11-15 21:19:03 +01:00
Georgi Gerganov
cbf5541a82 sync : ggml 2024-11-15 15:44:06 +02:00
Georgi Gerganov
4802ad350b
scripts : fix regex in sync [no ci] 2024-11-15 08:38:43 +02:00
Georgi Gerganov
5ea926dad7
sync : ggml 2024-11-13 18:11:54 +02:00
Georgi Gerganov
eec4d71737
scripts : add amx to sync-ggml.sh [no ci] 2024-11-07 23:11:36 +02:00
Georgi Gerganov
3b08828674
sync : ggml 2024-11-07 23:08:24 +02:00
Georgi Gerganov
a2c6fd747c
scripts : sync update 2024-11-07 23:07:55 +02:00
Georgi Gerganov
ce027adfb3
sync : ggml
2024-11-04 10:33:37 +02:00
Georgi Gerganov
815fe72adc
sync : ggml
2024-11-01 10:28:24 +02:00
Diego Devesa
c5b0f4b5d9
llama : refactor model loader with backend registry (#10026) 2024-10-30 02:01:23 +01:00
Georgi Gerganov
8d8ff71536
llama : remove Tail-Free sampling (#10071)
ggml-ci
2024-10-29 10:42:05 +02:00
Georgi Gerganov
cc2983d375
sync : ggml
2024-10-26 10:34:08 +03:00
Georgi Gerganov
9e4a2563ea
scripts : fix amx sync [no ci] 2024-10-26 10:33:31 +03:00
Georgi Gerganov
190a37d797
sync : ggml 2024-10-23 17:23:55 +03:00
Georgi Gerganov
17bb928080
readme : remove --memory-f32 references (#9925) 2024-10-17 23:43:05 +03:00
Georgi Gerganov
0e41b300ed
sync : ggml 2024-10-16 11:28:14 +03:00
standby24x7
fa42aa6d89
scripts : fix spelling typo in messages and comments (#9782)
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-10-08 09:19:53 +03:00
Georgi Gerganov
b6d6c5289f
sync : llama.cpp 2024-10-06 12:53:28 +03:00
Georgi Gerganov
58b16695e1
sync : ggml 2024-10-05 15:53:49 +03:00
Georgi Gerganov
17880771ad
sync : ggml 2024-10-04 18:50:25 +03:00
Georgi Gerganov
1bb8a64ebf
sync : ggml 2024-10-03 21:17:49 +03:00
Diego Devesa
c83ad6d01e
ggml-backend : add device and backend reg interfaces (#9707)
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-10-03 01:49:47 +02:00
Georgi Gerganov
f1b8c42711
sync : ggml 2024-10-01 16:09:42 +03:00
Georgi Gerganov
d0b1d663e4
sync : ggml 2024-09-29 21:16:07 +03:00
Georgi Gerganov
bb5f819975
sync : ggml 2024-09-24 11:01:18 +03:00
Georgi Gerganov
4301535326 sync : ggml
ggml-ci
2024-09-20 21:15:05 +03:00
Georgi Gerganov
0d2f22e45c
scripts : verify py deps at the start of compare (#9520) 2024-09-18 18:34:32 +03:00
Georgi Gerganov
385decbd63 sync : ggml 2024-09-08 11:05:55 +03:00
Georgi Gerganov
60a3107ccd scripts : option to increase git patch context 2024-09-08 11:05:55 +03:00
Georgi Gerganov
231cff5f6f sync : ggml 2024-08-27 22:41:27 +03:00
Georgi Gerganov
4305b57c80
sync : ggml 2024-08-09 10:03:48 +03:00
Georgi Gerganov
afd27f01fe
scripts : sync cann files (#0) 2024-08-08 14:56:52 +03:00
Georgi Gerganov
366d486c16
scripts : fix sync filenames (#0) 2024-08-08 14:40:12 +03:00
Georgi Gerganov
e44a561ab0
sync : ggml 2024-08-08 13:19:47 +03:00
Georgi Gerganov
5587e57a76 sync : ggml
ggml-ci
2024-08-05 08:50:57 +03:00