Commit Graph

8 Commits

Author SHA1 Message Date
Pierrick Hymbert
75cd4c7729
ci: bench: support sse and fix prompt processing time / server: add tokens usage in stream OAI response (#6495)
* ci: bench: support sse and fix prompt processing time
server: add tokens usage in stream mode

* ci: bench: README.md EOL

* ci: bench: remove total pp and tg as it is not accurate

* ci: bench: fix case when there is no token generated

* ci: bench: change to the 95 percentile for pp and tg as it is closer to what the server exports in metrics

* ci: bench: fix finish reason rate
2024-04-06 05:40:47 +02:00
Minsoo Cheong
7dda1b727e
ci: exempt master branch workflows from getting cancelled (#6486)
* ci: exempt master branch workflows from getting cancelled

* apply to bench.yml
2024-04-04 18:30:53 +02:00
Pierrick Hymbert
8120efee1d
ci: bench fix concurrency for workflow trigger dispatch with sha1 (#6478) 2024-04-04 16:59:04 +02:00
Pierrick Hymbert
7a2c92637a
ci: bench: add more ftype, fix triggers and bot comment (#6466)
* ci: bench: change trigger path to not spawn on each PR

* ci: bench: add more file type for phi-2: q8_0 and f16.
- do not show the comment by default

* ci: bench: add seed parameter in k6 script

* ci: bench: artefact name perf job

* Add iteration in the commit status, reduce again the autocomment

* ci: bench: add per slot metric in the commit status

* Fix trailing spaces
2024-04-04 12:57:58 +03:00
Ewout ter Hoeven
9f62c0173d
ci : update checkout, setup-python and upload-artifact to latest (#6456)
* CI: Update actions/checkout to v4

* CI: Update actions/setup-python to v5

* CI: Update actions/upload-artifact to v4
2024-04-03 21:01:13 +03:00
Pierrick Hymbert
37e7854c10
ci: bench: fix Resource not accessible by integration on PR event (#6393) 2024-03-30 12:36:07 +02:00
Pierrick Hymbert
28cb9a09c4
ci: bench: fix master not schedule, fix commit status failed on external repo (#6365) 2024-03-28 11:27:56 +01:00
Pierrick Hymbert
a016026a3a
server: continuous performance monitoring and PR comment (#6283)
* server: bench: init

* server: bench: reduce list of GPU nodes

* server: bench: fix graph, fix output artifact

* ci: bench: add mermaid in case of image cannot be uploaded

* ci: bench: more resilient, more metrics

* ci: bench: trigger build

* ci: bench: fix duration

* ci: bench: fix typo

* ci: bench: fix mermaid values, markdown generated

* typo on the step name

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* ci: bench: trailing spaces

* ci: bench: move images in a details section

* ci: bench: reduce bullet point size

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-03-27 20:26:49 +01:00