llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-29 04:44:34 +00:00

Author	SHA1	Message	Date
Chenguang Li	938f608742	CANN: RoPE operator optimization (#10563 ) * [cann] RoPE operator optimization * [CANN]Code Formatting --------- Co-authored-by: noemotiovon <noemotiovon@gmail.com>	2024-11-29 14:46:55 +08:00
Chenguang Li	b7420131bf	CANN: ROPE operator optimization (#10540 ) * [cann] ROPE operator optimization Co-authored-by: noemotiovon <noemotiovon@gmail.com>	2024-11-28 14:24:46 +08:00
Shanshan Shen	9a4b79bcfa	CANN: Improve the Inferencing Performance for Ascend NPU Device (#10454 ) * improve inferencing performance for ascend npu. Co-authored-by: Frank Mai <thxCode@thxcode0824@gmail.com> * some modification after review * some modifications after review * restore some modifications * restore some modifications --------- Co-authored-by: shanshan shen <shanshanshen333@gmail.com> Co-authored-by: Frank Mai <thxCode@thxcode0824@gmail.com>	2024-11-26 18:08:37 +08:00
Chenguang Li	7066b4cce2	CANN: RoPE and CANCAT operator optimization (#10488 ) Co-authored-by: noemotiovon <noemotiovon@gmail.com>	2024-11-26 17:31:05 +08:00
leo-pony	c18610b4ee	CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216 ) * CANN Support Ascend310P to accelerate F32 and F16 Model * Add compile option soc type macro ASCEND_310P to ggml-cann lib * Remove unused code * Remove the ascend soc_type hard code compile option in CMakelist.txt	2024-11-22 14:07:20 +08:00
Daniel Bevenius	06943a69f6	ggml : move rope type enum to ggml.h (#8949 ) * ggml : move rope type enum to ggml.h This commit moves the `llama_rope_type` enum from `llama.h` to `ggml.h` and changes its name to `ggml_rope_type`. The motivation for this change is to address the TODO in `llama.h` and use the enum in ggml. Note: This commit does not change the `mode` parameter to be of type `enum ggml_rope_type`. The name `mode` and its usage suggest that it might be more generic and possibly used as a bit field for multiple flags. Further investigation/discussion may be needed to determine if `mode` should be restricted to RoPE types. * squash! ggml : move rope type enum to ggml.h This commit removes GGML_ROPE_TYPE_NONE and GGML_ROPE_TYPE_GLM from ggml.h, and back the llama_rope_type enum. I've kept the assert for GGML_ROPE_TYPE_GLM as I'm not sure if it is safe to remove it yet. * squash! ggml : move rope type enum to ggml.h This commit removes the enum ggml_rope_type from ggml.h and replaces it with a define (GGML_ROPE_TYPE_NEOX). This define is used in the code to check if the mode is set to GPT-NeoX. Also the enum llama_rope_type has been updated to reflect this change. * squash! ggml : move rope type enum to ggml.h This commit contains a suggestion enable the GGML_ROPE_TYPE_NEOX macro/define to be passed to the shader compiler. * squash! ggml : move rope type enum to ggml.h This commit fixes the editorconfig-checker warnings. * squash! ggml : move rope type enum to ggml.h Update comment for ggml_rope function. * Revert "squash! ggml : move rope type enum to ggml.h" This reverts commit `6261222bd0`. * squash! ggml : move rope type enum to ggml.h Add GGML_ROPE_TYPE_NEOX to rope_common.comp. * remove extra line --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-08-13 21:13:15 +02:00
Molly Sophia	2d5dd7bb3f	ggml : add epsilon as a parameter for group_norm (#8818 ) Signed-off-by: Molly Sophia <mollysophia379@gmail.com>	2024-08-06 10:26:46 +03:00
wangshuai09	c02b0a8a4d	cann: support q4_0 model (#8822 )	2024-08-05 12:22:30 +08:00
Mengqing Cao	e09a800f9a	cann: Fix ggml_cann_im2col for 1D im2col (#8819 ) * fix ggml_cann_im2col for 1D im2col * fix build warning	2024-08-02 16:50:53 +08:00
wangshuai09	c8a0090922	cann: support q8_0 for Ascend backend (#8805 )	2024-08-01 10:39:05 +08:00
slaren	2b1f616b20	ggml : reduce hash table reset cost (#8698 ) * ggml : reduce hash table reset cost * fix unreachable code warnings after GGML_ASSERT(false) * GGML_ASSERT(false) -> GGML_ABORT("fatal error") * GGML_ABORT use format string	2024-07-27 04:41:55 +02:00
hipudding	1bdd8ae19f	[CANN] Add Ascend NPU backend (#6035 ) * [CANN] Add Ascend NPU backend Ascend is a full-stack AI computing infrastructure for industry applications and services based on Huawei Ascend processors and software. CANN (Compute Architecture of Neural Networks), developped by Huawei, is a heterogeneous computing architecture for AI. Co-authored-by: wangshuai09 <391746016@qq.com> * delete trailing whitespaces * Modify the code based on review comment * Rename LLAMA_CANN to GGML_CANN * Make ggml-common.h private * add ggml_cann prefix for acl funcs * Add logging for CANN backend * Delete Trailing whitespace --------- Co-authored-by: wangshuai09 <391746016@qq.com>	2024-07-17 14:23:50 +03:00

12 Commits