llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-11-14 23:09:53 +00:00

Jeff Bolz 80dd7ff22f vulkan: Optimize contiguous copies (#10254)	2024-11-13 07:58:57 +01:00

* tests: Fix memory bandwidth calculation for perf tests

  Add a flops calculation for flash attention.
  Add one GGML_OP_CPY perf test.

* vulkan: Optimize contiguous copies

  Add a variant of the copy shader for when the tensors are contiguous.
  Avoid the complex addressing calculations, and do four elements per
  invocation to hide some other overhead.

  Apply similar changes to the scale shader, since scale is always
  contiguous.

  Add a "progress bar" for shader compiles.
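The contiguous-copy idea in the commit message above can be illustrated on the CPU. The following is a minimal C++ sketch, not the Vulkan shader from the commit: `View2D`, `is_contiguous`, `copy_strided`, `copy_contiguous`, and `copy_view` are hypothetical names invented for this illustration, and the destination is assumed to be densely packed. It shows a general path that computes a strided offset per element, and a fast path that uses a flat index and handles four elements per loop step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of a 2-D source view; strides are in elements.
struct View2D {
    std::size_t rows;
    std::size_t cols;
    std::size_t row_stride; // elements between consecutive rows
    std::size_t col_stride; // elements between consecutive columns
};

// Conservative check: true only for a densely packed row-major view.
inline bool is_contiguous(const View2D & v) {
    return v.col_stride == 1 && v.row_stride == v.cols;
}

// General path: a strided source offset is computed for every element.
inline void copy_strided(const float * src, const View2D & v, float * dst) {
    for (std::size_t r = 0; r < v.rows; ++r) {
        for (std::size_t c = 0; c < v.cols; ++c) {
            dst[r * v.cols + c] = src[r * v.row_stride + c * v.col_stride];
        }
    }
}

// Fast path: flat indexing, four elements per loop step, then a scalar tail.
inline void copy_contiguous(const float * src, float * dst, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i + 0] = src[i + 0];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 3];
    }
    for (; i < n; ++i) {
        dst[i] = src[i];
    }
}

// Dispatch: pick the fast path when the source layout allows it.
inline void copy_view(const float * src, const View2D & v, float * dst) {
    assert(src != nullptr && dst != nullptr);
    if (is_contiguous(v)) {
        copy_contiguous(src, dst, v.rows * v.cols);
    } else {
        copy_strided(src, v, dst);
    }
}
```

Calling `copy_view` with a dense 3x4 view takes the flat path, while a transposed view (`row_stride == 1`, `col_stride == 4`) falls back to the strided path. Both write a densely packed row-major destination.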
.gitignore	tests : gitignore ggml-common.h	2024-03-09 14:17:11 +02:00
CMakeLists.txt	threadpool : skip polling for unused threads (#9461)	2024-09-17 11:19:46 +03:00
get-model.cpp	ci : add model tests + script wrapper (#4586)	2024-01-26 14:18:00 +02:00
get-model.h	ci : add model tests + script wrapper (#4586)	2024-01-26 14:18:00 +02:00
run-json-schema-to-grammar.mjs	server : revamp chat UI with vuejs and daisyui (#10175)	2024-11-07 17:31:10 -04:00
test-arg-parser.cpp	common : use common_ prefix for common library functions (#9805)	2024-10-10 22:57:42 +02:00
test-autorelease.cpp	ggml : add numa options (#5377)	2024-02-16 11:31:07 +02:00
test-backend-ops.cpp	vulkan: Optimize contiguous copies (#10254)	2024-11-13 07:58:57 +01:00
test-barrier.cpp	ggml : move CPU backend to a separate file (#10144)	2024-11-03 19:34:08 +01:00
test-c.c	Nomic Vulkan backend (#4456)	2024-01-29 15:50:50 -05:00
test-chat-template.cpp	llama : Add IBM granite template (#10013)	2024-10-28 18:45:33 +01:00
test-double-float.cpp	ggml : minor naming changes (#8433)	2024-07-12 10:46:02 +03:00
test-grad0.cpp	ggml : move CPU backend to a separate file (#10144)	2024-11-03 19:34:08 +01:00
test-grammar-integration.cpp	llama : refactor sampling v2 (#9294)	2024-09-07 15:16:19 +03:00
test-grammar-parser.cpp	llama : refactor sampling v2 (#9294)	2024-09-07 15:16:19 +03:00
test-json-schema-to-grammar.cpp	grammar : fix JSON Schema for string regex with top-level alt. (#9903)	2024-10-16 19:03:24 +03:00
test-llama-grammar.cpp	llama : refactor sampling v2 (#9294)	2024-09-07 15:16:19 +03:00
test-log.cpp	common : use common_ prefix for common library functions (#9805)	2024-10-10 22:57:42 +02:00
test-lora-conversion-inference.sh	lora : fix llama conversion script with ROPE_FREQS (#9117)	2024-08-23 12:58:53 +02:00
test-model-load-cancel.cpp	ggml : add numa options (#5377)	2024-02-16 11:31:07 +02:00
test-opt.cpp	code : normalize enum names (#5697)	2024-02-25 12:09:09 +02:00
test-quantize-fns.cpp	ggml : move CPU backend to a separate file (#10144)	2024-11-03 19:34:08 +01:00
test-quantize-perf.cpp	ggml : move CPU backend to a separate file (#10144)	2024-11-03 19:34:08 +01:00
test-rope.cpp	ggml : move CPU backend to a separate file (#10144)	2024-11-03 19:34:08 +01:00
test-sampling.cpp	llama : remove Tail-Free sampling (#10071)	2024-10-29 10:42:05 +02:00
test-tokenizer-0.cpp	common : use common_ prefix for common library functions (#9805)	2024-10-10 22:57:42 +02:00
test-tokenizer-0.py	py : logging and flake8 suppression refactoring (#7081)	2024-05-05 08:07:48 +03:00
test-tokenizer-0.sh	tests : fix test-tokenizer-0.sh	2024-05-28 15:04:09 +03:00
test-tokenizer-1-bpe.cpp	common : use common_ prefix for common library functions (#9805)	2024-10-10 22:57:42 +02:00
test-tokenizer-1-spm.cpp	common : use common_ prefix for common library functions (#9805)	2024-10-10 22:57:42 +02:00
test-tokenizer-random.py	llama : fix pre-tokenization of non-special added tokens (#8228)	2024-07-13 23:35:10 -04:00