llama.cpp/tests
Abhilash Majumder 0f648573dd
ggml : add unified SYCL backend for Intel GPUs (#2690)
* first update for migration

* update init_cublas

* add debug functio, commit all help code

* step 1

* step 2

* step3 add fp16, slower 31->28

* add GGML_LIST_DEVICE function

* step 5 format device and print

* step6, enhance error check, remove CUDA macro, enhance device id to fix none-zero id issue

* support main device is non-zero

* step7 add debug for code path, rm log

* step 8, rename all macro & func from cuda by sycl

* fix error of select non-zero device, format device list

* ren ggml-sycl.hpp -> ggml-sycl.h

* clear CMAKE to rm unused lib and options

* correct queue: rm dtct:get_queue

* add print tensor function to debug

* fix error: wrong result in 658746bb26702e50f2c59c0e4ada8e9da6010481

* summary dpct definition in one header file to replace folder:dpct

* refactor device log

* mv dpct definition from folder dpct to ggml-sycl.h

* update readme, refactor build script

* fix build with sycl

* set nthread=1 when sycl, increase performance

* add run script, comment debug code

* add ls-sycl-device tool

* add ls-sycl-device, rm unused files

* rm rear space

* dos2unix

* Update README_sycl.md

* fix return type

* remove sycl version from include path

* restore rm code to fix hang issue

* add syc and link for sycl readme

* rm original sycl code before refactor

* fix code err

* add know issue for pvc hang issue

* enable SYCL_F16 support

* align pr4766

* check for sycl blas, better performance

* cleanup 1

* remove extra endif

* add build&run script, clean CMakefile, update guide by review comments

* rename macro to intel hardware

* editor config format

* format fixes

* format fixes

* editor format fix

* Remove unused headers

* skip build sycl tool for other code path

* replace tab by space

* fix blas matmul function

* fix mac build

* restore hip dependency

* fix conflict

* ren as review comments

* mv internal function to .cpp file

* export funciton print_sycl_devices(), mv class dpct definition to source file

* update CI/action for sycl code, fix CI error of repeat/dup

* fix action ID format issue

* rm unused strategy

* enable llama_f16 in ci

* fix conflict

* fix build break on MacOS, due to CI of MacOS depend on external ggml, instead of internal ggml

* fix ci cases for unsupported data type

* revert unrelated changed in cuda cmake
remove useless nommq
fix typo of GGML_USE_CLBLAS_SYCL

* revert hip cmake changes

* fix indent

* add prefix in func name

* revert no mmq

* rm cpu blas duplicate

* fix no_new_line

* fix src1->type==F16 bug.

* pass batch offset for F16 src1

* fix batch error

* fix wrong code

* revert sycl checking in test-sampling

* pass void as arguments of ggml_backend_sycl_print_sycl_devices

* remove extra blank line in test-sampling

* revert setting n_threads in sycl

* implement std::isinf for icpx with fast math.

* Update ci/run.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update examples/sycl/run-llama2.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update examples/sycl/run-llama2.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update CMakeLists.txt

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update CMakeLists.txt

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update CMakeLists.txt

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update CMakeLists.txt

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* add copyright and MIT license declare

* update the cmd example

---------

Co-authored-by: jianyuzh <jianyu.zhang@intel.com>
Co-authored-by: luoyu-intel <yu.luo@intel.com>
Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-01-28 17:56:23 +02:00
..
.gitignore tests : gitignore test-c.o 2024-01-26 14:48:15 +02:00
CMakeLists.txt ci : add model tests + script wrapper (#4586) 2024-01-26 14:18:00 +02:00
get-model.cpp ci : add model tests + script wrapper (#4586) 2024-01-26 14:18:00 +02:00
get-model.h ci : add model tests + script wrapper (#4586) 2024-01-26 14:18:00 +02:00
test-autorelease.cpp ci : add model tests + script wrapper (#4586) 2024-01-26 14:18:00 +02:00
test-backend-ops.cpp ggml : add unified SYCL backend for Intel GPUs (#2690) 2024-01-28 17:56:23 +02:00
test-c.c tests : add a C compliance test (#2848) 2023-08-30 09:20:26 +03:00
test-double-float.cpp ggml : move FP16 <-> FP32 code to ggml-impl.h (#3861) 2023-10-30 19:19:15 +02:00
test-grad0.cpp cuda : improve cuda pool efficiency using virtual memory (#4606) 2023-12-24 14:34:22 +01:00
test-grammar-parser.cpp gguf : new file format with flexible meta data (beta) (#2398) 2023-08-21 23:07:43 +03:00
test-llama-grammar.cpp Remove unused data and add fixes (#5154) 2024-01-27 15:25:55 +01:00
test-model-load-cancel.cpp ci : add model tests + script wrapper (#4586) 2024-01-26 14:18:00 +02:00
test-opt.cpp sync : ggml (backend v2) (#3912) 2023-11-13 14:16:23 +02:00
test-quantize-fns.cpp ggml : SOTA 2-bit quants (add IQ2_XS) (#4856) 2024-01-11 21:39:39 +02:00
test-quantize-perf.cpp ggml : use ggml_row_size where possible (#4472) 2023-12-14 20:05:21 +01:00
test-rope.cpp llama : custom attention mask + parallel decoding + no context swaps (#3228) 2023-09-28 19:04:36 +03:00
test-sampling.cpp Tests for min_p, sampling queue (#5147) 2024-01-28 09:35:14 +01:00
test-tokenizer-0-falcon.cpp Minor improvements in GPT2 tokenizer (#3567) 2023-10-10 18:59:52 +02:00
test-tokenizer-0-falcon.py ci : add flake8 to github actions (python linting) (#4129) 2023-11-20 11:35:47 +01:00
test-tokenizer-0-llama.cpp Minor improvements in GPT2 tokenizer (#3567) 2023-10-10 18:59:52 +02:00
test-tokenizer-0-llama.py ci : add flake8 to github actions (python linting) (#4129) 2023-11-20 11:35:47 +01:00
test-tokenizer-1-bpe.cpp Add more tokenizer tests (#3742) 2023-10-24 09:17:17 +02:00
test-tokenizer-1-llama.cpp Work on the BPE tokenizer (#3252) 2023-10-03 09:16:26 +02:00