Chen Xi
b549a1bbef
[SYCL] fix the mul_mat_id ut issues (#8427)
* fix part of mul_mat_id
* skip the bfloat 16 sycl ut
Signed-off-by: Chen Xi <xi2chen@intel.com>
---------
Signed-off-by: Chen Xi <xi2chen@intel.com>
Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com>
Co-authored-by: Chen Xi <xi2chen@intel.com>
2024-07-12 08:52:04 +08:00
Alberto Cabrera Pérez
5b0b8d8cfb
sycl : Reenabled mmvq path for the SYCL Nvidia Backend (#8372)
* SYCL : Reenabled mmvq path for the SYCL Nvidia Backend
* Reduced verbosity of comment
2024-07-09 22:03:15 +08:00
Ouadie EL FAROUKI
1f3e1b66e2
Enabled more data types for oneMKL gemm_batch (#8236)
2024-07-05 13:23:25 +01:00
luoyu-intel
a9554e20b6
[SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266)
* fix group_norm ut
* split softmax
* fix softmax
* add concat support condition
* revert debug code
* move QK_WARP_SIZE to presets.hpp
2024-07-05 13:06:13 +08:00
Neo Zhang Jianyu
f09b7cb609
rm get_work_group_size() by local cache for performance (#8286)
Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2024-07-05 10:32:29 +08:00
luoyu-intel
d08c20edde
[SYCL] Fix the sub group size of Intel (#8106)
* use warp_size macro for all sycl kernels
* fix mask of permute_sub_group_by_xor
* fix rms_norm with correct warp number
* fix rms_norm_f32/group_norm_f32
* move norm to norm.cpp file
* fix quantize bug
* fix mmvq's batch size
2024-07-02 10:16:00 +08:00
zhentaoyu
197fe6c1d7
[SYCL] Update SYCL-Rope op and Refactor (#8157)
* align with rope.cu and move sycl-op to a single file
2024-07-01 19:39:06 +08:00
Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
* scripts : update sync [no ci]
* files : relocate [no ci]
* ci : disable kompute build [no ci]
* cmake : fixes [no ci]
* server : fix mingw build
ggml-ci
* cmake : minor [no ci]
* cmake : link math library [no ci]
* cmake : build normal ggml library (not object library) [no ci]
* cmake : fix kompute build
ggml-ci
* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
ggml-ci
* move public backend headers to the public include directory (#8122)
* move public backend headers to the public include directory
* nix test
* spm : fix metal header
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* scripts : fix sync paths [no ci]
* scripts : sync ggml-blas.h [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00