llama.cpp/gguf-py/gguf
Latest commit: 9958c81b79 by nopperl — Implement the OLMo architecture (#6741) — 2024-04-19 11:35:54 +02:00

* implement olmo architecture
* remove unused variable
* remove unused moe branch
* remove check for weight
* remove superfluous moe, bias and rope tensors
* clarified comment
* fix clamp_kqv setting
* remove obsolete parameter name filter
File              | Last commit                                                                | Date
------------------|----------------------------------------------------------------------------|--------------------------
__init__.py       | gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)  | 2023-11-11 08:04:50 +03:00
constants.py      | Implement the OLMo architecture (#6741)                                    | 2024-04-19 11:35:54 +02:00
gguf_reader.py    | gguf : add support for I64 and F64 arrays (#6062)                          | 2024-03-15 10:46:51 +02:00
gguf_writer.py    | convert : support models with multiple chat templates (#6588)              | 2024-04-18 14:49:01 +03:00
gguf.py           | gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)  | 2023-11-11 08:04:50 +03:00
py.typed          | convert : various script cleanups/fixes + merges and special token handling (#2842) | 2023-08-30 11:25:50 +03:00
tensor_mapping.py | llama : add qwen2moe (#6074)                                               | 2024-04-16 18:40:48 +03:00
vocab.py          | convert : support models with multiple chat templates (#6588)              | 2024-04-18 14:49:01 +03:00