mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-11-14 23:09:53 +00:00
544f409b4b
a return before a barrier (that happens only in some threads in a workgroup) leads to UB. While the old code actually works on some devices, it fails on some others (i.e. "smaller" GPUs). BTW, I think it would be better to set specialization constants when the graph is built, in that way the local workgroup could be sized appropriately. But it would take a lot of work. Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com> |
||
---|---|---|
.. | ||
cmake | ||
include | ||
src | ||
.gitignore | ||
CMakeLists.txt |