llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-26 03:14:35 +00:00

HimariO ba1cb19cdd llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00

* Barebone Qwen2VL LLM convertor
* Add Qwen2VL cli entrypoint
* [WIP] add qwen2vl arch
* Verify m-rope output
* Add vl-rope/2d-rope support for qwen2vl ViT
* update qwen2vl cli tool
* update 5D tensor op workaround
* [WIP] qwen2vl vision model
* make batch and clip utils compatible with qwen2vl
* [WIP] create inference workflow, gguf convert script but fix
* correcting vision-rope behavior, add the missing last layer back to ViT
* add arg parser to qwen2vl_surgery
* replace variable size array with vector
* cuda-gdb cmake preset
* add fp32 mrope, vision rope kernel
* add fp16 support for qwen2vl and m-rope
* add `GGML_ROPE_TYPE_MROPE`, `GGML_ROPE_TYPE_VISION`
* fix rope op mode switching, out dated func args
* update `llama_hparams`
* update to keep up stream changes
* resolve linter, test errors
* add makefile entry, update speical image padding token
* add mrope unit test, fix few compiler warnings
* rename `mrope` related function, params
* minor updates on debug util, bug fixs
* add `m-rope` testcase to `test-backend-ops`
* Apply suggestions from code review
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* fix traililng whitespce
* store `llama_hparams.rope_sections` with fixed size array
* update position id tensor size check in GGML_OP_ROPE
* minor updates
* update `ggml_backend__supports_op` of unsupported backends remote old `rope_section` compare operator

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
..
CMakeLists.txt	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
llama-grammar.cpp	llama : refactor sampling v2 (#9294)	2024-09-07 15:16:19 +03:00
llama-grammar.h	llama : refactor sampling v2 (#9294)	2024-09-07 15:16:19 +03:00
llama-impl.h	log : add CONT level for continuing previous log entry (#9610)	2024-09-24 10:15:35 +03:00
llama-sampling.cpp	DRY: Fixes clone functionality (#10192)	2024-11-07 16:20:25 +01:00
llama-sampling.h	llama : add DRY sampler (#9702)	2024-10-25 19:07:34 +03:00
llama-vocab.cpp	llama : add Minerva 7B model support (#10673)	2024-12-05 20:30:59 +02:00
llama-vocab.h	llama : add DRY sampler (#9702)	2024-10-25 19:07:34 +03:00
llama.cpp	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
unicode-data.cpp	server : better security control for public deployments (#9776)	2024-10-08 13:27:04 +02:00
unicode-data.h	llama : reduce compile time and binary size (#9712)	2024-10-02 15:49:55 +02:00
unicode.cpp	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
unicode.h	llama : move vocab, grammar and sampling into separate files (#8508)	2024-07-23 13:10:17 +03:00