llama.cpp

root/llama.cpp

Fork 0

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-26 03:14:35 +00:00

Commit Graph

Author	SHA1	Message	Date
Georgi Gerganov	811872a59d	speculative : simplify the implementation (#10504 ) ggml-ci	2024-11-26 12:29:38 +02:00
Diego Devesa	10bce0450f	llama : accept a list of devices to use to offload a model (#10497 ) * llama : accept a list of devices to use to offload a model * accept `--dev none` to completely disable offloading * fix dev list with dl backends * rename env parameter to LLAMA_ARG_DEVICE for consistency	2024-11-25 19:30:06 +01:00
Georgi Gerganov	d9d54e498d	speculative : refactor and add a simpler example (#10362 ) * speculative : refactor and add a simpler example ggml-ci * speculative : clean-up and add comments and TODOs [no ci] * speculative : manage context in common_speculative ggml-ci * speculative : simplify ggml-ci * speculative : simplify (cont) ggml-ci * speculative : add --draft-min CLI arg * speculative : minor fixup * make : build fixes * speculative : do not redraft previous drafts ggml-ci * speculative : fix the draft sampling ggml-ci * speculative : fix compile warning * common : refactor args ggml-ci * common : change defaults [no ci] * common : final touches ggml-ci	2024-11-25 09:58:41 +02:00

Author

SHA1

Message

Date

Georgi Gerganov

811872a59d

speculative : simplify the implementation (#10504 )

ggml-ci

2024-11-26 12:29:38 +02:00

Diego Devesa

10bce0450f

llama : accept a list of devices to use to offload a model (#10497 )

* llama : accept a list of devices to use to offload a model

* accept `--dev none` to completely disable offloading

* fix dev list with dl backends

* rename env parameter to LLAMA_ARG_DEVICE for consistency

2024-11-25 19:30:06 +01:00

Georgi Gerganov

d9d54e498d

speculative : refactor and add a simpler example (#10362 )

* speculative : refactor and add a simpler example

ggml-ci

* speculative : clean-up and add comments and TODOs [no ci]

* speculative : manage context in common_speculative

ggml-ci

* speculative : simplify

ggml-ci

* speculative : simplify (cont)

ggml-ci

* speculative : add --draft-min CLI arg

* speculative : minor fixup

* make : build fixes

* speculative : do not redraft previous drafts

ggml-ci

* speculative : fix the draft sampling

ggml-ci

* speculative : fix compile warning

* common : refactor args

ggml-ci

* common : change defaults [no ci]

* common : final touches

ggml-ci

2024-11-25 09:58:41 +02:00

3 Commits