llama.cpp/gguf-py/gguf
hxer7963 069574775c
[Model] Add support for xverse (#6301)
* Support xverse model convert to gguf format.

* 1. Convert xverse models to gguf;
2. Add LLM_ARCH_XVERSE inference in llama.cpp;
3. Add xverse item in Supported models in README.md;

* * gguf-py: remove redundant logs
* llama: remove the init_mapping_prefetch custom parameter

* llama.cpp: Include the changes from #6122 to exclude the unused outputs of the last layers.

* - Fix format issues
- Remove duplicate set kqv_out to llm_build_kv

* Update llama.cpp

---------

Co-authored-by: willhe <willhe@xverse.cn>
Co-authored-by: willhe <hexin@xverse.cn>
2024-03-29 14:37:03 +01:00
File | Last commit | Date
__init__.py | gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981) | 2023-11-11 08:04:50 +03:00
constants.py | [Model] Add support for xverse (#6301) | 2024-03-29 14:37:03 +01:00
gguf_reader.py | gguf : add support for I64 and F64 arrays (#6062) | 2024-03-15 10:46:51 +02:00
gguf_writer.py | llama : add Command-R support (#6033) | 2024-03-15 22:41:22 +02:00
gguf.py | gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981) | 2023-11-11 08:04:50 +03:00
py.typed | convert : various script cleanups/fixes + merges and special token handling (#2842) | 2023-08-30 11:25:50 +03:00
tensor_mapping.py | llama : add grok-1 support (#6204) | 2024-03-23 18:41:53 +02:00
vocab.py | fix(gguf-py): special tokens are no longer skipped when add_<token>_token is set to false (#5487) | 2024-02-15 14:14:37 +01:00