Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake ( #8006 )
...
* scripts : update sync [no ci]
* files : relocate [no ci]
* ci : disable kompute build [no ci]
* cmake : fixes [no ci]
* server : fix mingw build
ggml-ci
* cmake : minor [no ci]
* cmake : link math library [no ci]
* cmake : build normal ggml library (not object library) [no ci]
* cmake : fix kompute build
ggml-ci
* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
ggml-ci
* move public backend headers to the public include directory (#8122 )
* move public backend headers to the public include directory
* nix test
* spm : fix metal header
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* scripts : fix sync paths [no ci]
* scripts : sync ggml-blas.h [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00
Olivier Chafik
1c641e6aac
build
: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809 )
...
* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
* server: update refs -> llama-server
gitignore llama-server
* server: simplify nix package
* main: update refs -> llama
fix examples/main ref
* main/server: fix targets
* update more names
* Update build.yml
* rm accidentally checked in bins
* update straggling refs
* Update .gitignore
* Update server-llm.sh
* main: target name -> llama-cli
* Prefix all example bins w/ llama-
* fix main refs
* rename {main->llama}-cmake-pkg binary
* prefix more cmake targets w/ llama-
* add/fix gbnf-validator subfolder to cmake
* sort cmake example subdirs
* rm bin files
* fix llama-lookup-* Makefile rules
* gitignore /llama-*
* rename Dockerfiles
* rename llama|main -> llama-cli; consistent RPM bin prefixes
* fix some missing -cli suffixes
* rename dockerfile w/ llama-cli
* rename(make): llama-baby-llama
* update dockerfile refs
* more llama-cli(.exe)
* fix test-eval-callback
* rename: llama-cli-cmake-pkg(.exe)
* address gbnf-validator unused fread warning (switched to C++ / ifstream)
* add two missing llama- prefixes
* Updating docs for eval-callback binary to use new `llama-` prefix.
* Updating a few lingering doc references for rename of main to llama-cli
* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.
* Updating documentation references for lookup-merge and export-lora
* Updating two small `main` references missed earlier in the finetune docs.
* Update apps.nix
* update grammar/README.md w/ new llama-* names
* update llama-rpc-server bin name + doc
* Revert "update llama-rpc-server bin name + doc"
This reverts commit e474ef1df4
.
* add hot topic notice to README.md
* Update README.md
* Update README.md
* rename gguf-split & quantize bins refs in **/tests.sh
---------
Co-authored-by: HanClinto <hanclinto@gmail.com>
2024-06-13 00:41:52 +01:00
Sigbjørn Skjæret
8f8acc8683
Disable benchmark on forked repo ( #7034 )
...
* Disable benchmark on forked repo
* only check owner on schedule event
* check owner on push also
* more readable as multi-line
* ternary won't work
* style++
* test++
* enable actions debug
* test--
* remove debug
* test++
* do debug where we can get logs
* test--
* this is driving me crazy
* correct github.event usage
* remove test condition
* correct github.event usage
* test++
* test--
* event_name is pull_request_target
* test++
* test--
* update ref checks
2024-05-05 13:38:55 +02:00
Olivier Chafik
b8a7a5a90f
build(cmake): simplify instructions (cmake -B build && cmake --build build ...
) ( #6964 )
...
* readme: cmake . -B build && cmake --build build
* build: fix typo
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* build: drop implicit . from cmake config command
* build: remove another superfluous .
* build: update MinGW cmake commands
* Update README-sycl.md
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
* build: reinstate --config Release as not the default w/ some generators + document how to build Debug
* build: revert more --config Release
* build: nit / remove -H from cmake example
* build: reword debug instructions around single/multi config split
---------
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2024-04-29 17:02:45 +01:00
Pierrick Hymbert
7d641c26ac
ci: fix concurrency for pull_request_target ( #6917 )
2024-04-26 09:26:59 +02:00
Pierrick Hymbert
c0956b09ba
ci: fix job are cancelling each other ( #6781 )
2024-04-22 13:22:54 +02:00
Pierrick Hymbert
75cd4c7729
ci: bench: support sse and fix prompt processing time / server: add tokens usage in stream OAI response ( #6495 )
...
* ci: bench: support sse and fix prompt processing time
server: add tokens usage in stream mode
* ci: bench: README.md EOL
* ci: bench: remove total pp and tg as it is not accurate
* ci: bench: fix case when there is no token generated
* ci: bench: change to the 95 percentile for pp and tg as it is closer to what the server exports in metrics
* ci: bench: fix finish reason rate
2024-04-06 05:40:47 +02:00
Minsoo Cheong
7dda1b727e
ci: exempt master branch workflows from getting cancelled ( #6486 )
...
* ci: exempt master branch workflows from getting cancelled
* apply to bench.yml
2024-04-04 18:30:53 +02:00
Pierrick Hymbert
8120efee1d
ci: bench fix concurrency for workflow trigger dispatch with sha1 ( #6478 )
2024-04-04 16:59:04 +02:00
Pierrick Hymbert
7a2c92637a
ci: bench: add more ftype, fix triggers and bot comment ( #6466 )
...
* ci: bench: change trigger path to not spawn on each PR
* ci: bench: add more file type for phi-2: q8_0 and f16.
- do not show the comment by default
* ci: bench: add seed parameter in k6 script
* ci: bench: artefact name perf job
* Add iteration in the commit status, reduce again the autocomment
* ci: bench: add per slot metric in the commit status
* Fix trailing spaces
2024-04-04 12:57:58 +03:00
Ewout ter Hoeven
9f62c0173d
ci : update checkout, setup-python and upload-artifact to latest ( #6456 )
...
* CI: Update actions/checkout to v4
* CI: Update actions/setup-python to v5
* CI: Update actions/upload-artifact to v4
2024-04-03 21:01:13 +03:00
Pierrick Hymbert
37e7854c10
ci: bench: fix Resource not accessible by integration on PR event ( #6393 )
2024-03-30 12:36:07 +02:00
Pierrick Hymbert
28cb9a09c4
ci: bench: fix master not schedule, fix commit status failed on external repo ( #6365 )
2024-03-28 11:27:56 +01:00
Pierrick Hymbert
a016026a3a
server: continuous performance monitoring and PR comment ( #6283 )
...
* server: bench: init
* server: bench: reduce list of GPU nodes
* server: bench: fix graph, fix output artifact
* ci: bench: add mermaid in case of image cannot be uploaded
* ci: bench: more resilient, more metrics
* ci: bench: trigger build
* ci: bench: fix duration
* ci: bench: fix typo
* ci: bench: fix mermaid values, markdown generated
* typo on the step name
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* ci: bench: trailing spaces
* ci: bench: move images in a details section
* ci: bench: reduce bullet point size
---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-03-27 20:26:49 +01:00