llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-26 03:14:35 +00:00

Georgi Gerganov 9cb317f77e ggml : full ALiBi support (#7192)	2024-05-11 10:32:41 +03:00

* ggml : full ALiBi support
* ggml : update ggml_soft_max_ext() CUDA, SYCL
* ggml : ggml_flash_attn_ext() support ALiBi (CPU)
* ggml : ggml_flash_attn_ext() support ALiBi (Metal)
* ggml : fix warning
* ggml : ggml_flash_attn_ext() support ALiBi (CUDA)
ggml-ci
* ggml : fix assert message
* vulkan : add dev notes
* ggml : require mask when using ALiBi
ggml-ci
* convert : fix convert for refact models

__init__.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
constants.py	convert-hf : save memory with lazy evaluation (#7075)	2024-05-08 18:16:38 -04:00
gguf_reader.py	convert-hf : save memory with lazy evaluation (#7075)	2024-05-08 18:16:38 -04:00
gguf_writer.py	convert-hf : save memory with lazy evaluation (#7075)	2024-05-08 18:16:38 -04:00
gguf.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
py.typed	convert : various script cleanups/fixes + merges and special token handling (#2842)	2023-08-30 11:25:50 +03:00
tensor_mapping.py	ggml : full ALiBi support (#7192)	2024-05-11 10:32:41 +03:00
vocab.py	convert-hf : save memory with lazy evaluation (#7075)	2024-05-08 18:16:38 -04:00