llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-26 11:24:35 +00:00

Author	SHA1	Message	Date
Pierrick Hymbert	4bd0f93e4a	model: support arch `DbrxForCausalLM` (#6515 ) * model: dbrx convert to gguf #6344 * llama: support dbrx #6344 * doc: dbrx: add the model as supported * scripts: get-wikitext-2 add unzip * llama: increase maximum experts allowed * llama: factorize moe graph implementation between grok, mixtral and dbrx --------- Co-authored-by: Megha Agarwal <16129366+megha95@users.noreply.github.com>	2024-04-13 11:33:52 +02:00
Daniel Bevenius	f4183afe6a	scripts : add --outdir option to hf.sh (#6600 ) * scripts : add --outdir option to hf.sh This commit adds an option to the hf.sh script that allows the user to specify an output directory for the downloaded file. The motivation for this changes is that examples that use the hf.sh script to download models from huggingface can now specify the output directory, perhaps to the `models` directory to keep them in one place and not clutter the root directory. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com> * squash! scripts : add --outdir option to hf.sh Fix format of the --outdir option in the usage message. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com> --------- Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2024-04-11 16:22:47 +03:00
Georgi Gerganov	c4a3a4ff47	sync : ggml	2024-04-09 20:29:06 +03:00
Georgi Gerganov	e11a8999b5	license : update copyright notice + add AUTHORS (#6405 ) * license : add AUTHORS * authors : update * scipts : add LICENSE and gen-authors.sh to sync	2024-04-09 09:23:19 +03:00
Georgi Gerganov	c37247796b	sync : ggml	2024-04-07 17:05:51 +03:00
Georgi Gerganov	43e8995e75	scripts : sync ggml-cuda folder	2024-04-07 16:08:12 +03:00
Georgi Gerganov	54ea0698fb	sync : ggml	2024-04-06 18:27:46 +03:00
Johannes Gäßler	33a5244806	compare-llama-bench.py: fix long hexsha args (#6424 )	2024-04-01 13:30:43 +02:00
Georgi Gerganov	d48ccf3ad4	sync : ggml (#6351 ) * sync : ggml ggml-ci * cuda : move GGML_CUDA_DMMV constants to dmmv.cuh --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-03-29 17:45:46 +02:00
slaren	280345968d	cuda : rename build flag to LLAMA_CUDA (#6299 )	2024-03-26 01:16:01 +01:00
Johannes Gäßler	50ccaf5eac	lookup: complement data from context with general text statistics (#5479 ) * lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens * fixup! lookup: evaluation tools, use corpus/previous gens	2024-03-23 01:24:36 +01:00
Georgi Gerganov	b838b53ad6	sync : ggml	2024-03-10 20:10:46 +02:00
Georgi Gerganov	8a3012a4ad	ggml : add ggml-common.h to deduplicate shared code (#5940 ) * ggml : add ggml-common.h to shared code ggml-ci * scripts : update sync scripts * sycl : reuse quantum tables ggml-ci * ggml : minor * ggml : minor * sycl : try to fix build	2024-03-09 12:47:57 +02:00
slaren	652ca2bded	compare-llama-bench.py : remove mul_mat_q (#5892 )	2024-03-05 22:27:29 +01:00
Georgi Gerganov	efd8533ef8	sync : ggml ggml-ci	2024-03-04 20:54:23 +02:00
Georgi Gerganov	a0fc62661f	sync : ggml	2024-03-04 10:40:04 +02:00
Georgi Gerganov	ef2cd694c4	scripts : add pod-llama.sh	2024-03-02 16:54:20 +02:00
Pierrick Hymbert	3ab8b3a92e	llama : cleanup unused mmq flags (#5772 ) * cleanup unused --no-mul-mat-q,-nommq, -mmq, --mul-mat-q, mul_mat_q * remove: mul_mat_q in compare llama bench and usage * update llama-bench --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-03-01 13:39:06 +02:00
Georgi Gerganov	8c0e8f4e73	sync : ggml	2024-02-28 11:17:32 +02:00
Georgi Gerganov	334f76fa38	sync : ggml	2024-02-22 23:21:05 +02:00
Georgi Gerganov	5022cf242d	sync : ggml	2024-02-21 16:52:52 +02:00
Georgi Gerganov	eccd7a26dd	sync : ggml (#5633 ) * ggml : fix conv_2d batch mode (ggml/737) Co-authored-by: bssrdf <bssrdf@gmail.com> * ggml : compute forward no longer pass src tensors (ggml/729) * sync : ggml ggml-ci --------- Co-authored-by: bssrdf <merlintiger@hotmail.com> Co-authored-by: bssrdf <bssrdf@gmail.com>	2024-02-21 16:17:10 +02:00
Georgi Gerganov	337c9cbd52	sync : ggml ggml-ci	2024-02-19 15:09:43 +02:00
Jared Van Bortel	a0c2dad9d4	build : pass all warning flags to nvcc via -Xcompiler (#5570 ) * build : pass all warning flags to nvcc via -Xcompiler * make : fix apparent mis-merge from #3952 * make : fix incorrect GF_CC_VER for CUDA host compiler	2024-02-18 16:21:52 -05:00
Georgi Gerganov	b1de96824b	ci : fix wikitext url + compile warnings (#5569 ) ggml-ci	2024-02-18 22:39:30 +02:00
Georgi Gerganov	d2819d5577	scripts : add helpers script for bench comparing commits (#5521 ) * scripts : add helpers script for bench comparing commits * scripts : detect CUDA * set flags after checking the command line * fix make flags --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-02-16 15:14:40 +02:00
Georgi Gerganov	9350a1cf21	scripts : add hf.sh helper script (#5501 ) * scripts : add hf.sh helper scripts * hf : add error logs * hf : add support for --repo and --file	2024-02-15 15:41:15 +02:00
Georgi Gerganov	3b169441df	sync : ggml (#5452 ) * ggml-alloc : v3 (ggml/727) * ggml-alloc v3 ggml-ci * fix ci ggml-ci * whisper : check for backend buffer allocation failures * whisper : avoid leaks when initialization fails * cleanup ggml-ci * style fixes ggml-ci * sync : ggml * update llama.cpp, clip.cpp, export-lora.cpp * update finetune.cpp, train-text-from-scratch.cpp ggml-ci * ggml-backend : reduce alignment to 32 to match gguf and fix mmap --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-02-12 09:16:06 +02:00
Georgi Gerganov	cd9aea63b5	scripts : update sync scripts with new backends	2024-02-10 09:53:05 +02:00
Georgi Gerganov	43b65f5eb8	sync : ggml	2024-02-10 09:30:36 +02:00
Georgi Gerganov	30679d438d	scripts : fix typos, cleanup (#5303 )	2024-02-05 09:48:03 +02:00
Нияз Гарифзянов	4be04c8965	scripts : add non-interactive server-llm.sh (#5303 ) * Update server-llm.sh Add flag --non-interactive that allows run script without asking a permission * Update scripts/server-llm.sh --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-02-05 09:43:57 +02:00
Georgi Gerganov	e437b37fd0	scripts : parse wtype in server-llm.sh (#5167 ) * scripts : parse wtype in server-llm.sh * scripts : fix check for wfile	2024-02-02 14:23:40 +02:00
Neo Zhang Jianyu	01684139c3	support SYCL backend windows build (#5208 ) * support SYCL backend windows build * add windows build in CI * add for win build CI * correct install oneMKL * fix install issue * fix ci * fix install cmd * fix install cmd * fix install cmd * fix install cmd * fix install cmd * fix win build * fix win build * fix win build * restore other CI part * restore as base * rm no new line * fix no new line issue, add -j * fix grammer issue * allow to trigger manually, fix format issue * fix format * add newline * fix format * fix format * fix format issuse --------- Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>	2024-01-31 08:08:07 +05:30
Georgi Gerganov	8f8ddfcfad	sync : ggml (#0 )	2024-01-30 16:21:57 +02:00
Georgi Gerganov	35dec26cc2	sync : ggml	2024-01-28 19:48:05 +02:00
Georgi Gerganov	753eafed0e	sync : ggml	2024-01-27 17:00:24 +02:00
Georgi Gerganov	5f1925a8ce	scripts : move run-with-preset.py from root to scripts folder	2024-01-26 17:09:44 +02:00
crasm	413e7b0559	ci : add model tests + script wrapper (#4586 ) * scripts : add lib.sh and lib_test.sh * scripts : stub out new ci-run.sh script * scripts : switch to PascalCase for functions This looks a little odd at first, but I find it very useful as a convention to know if a command is part of our code vs a builtin. * scripts : add some fancy conversion from snake_case to PascalCase * Add venv to ci/run.sh * Revert scripts work * scripts : add wrapper script for local use of ci/run.sh * Simplify .gitignore for tests, clang-tidy fixes * Label all ctest tests * ci : ctest uses -L main * Attempt at writing ctest_with_model * Update test-model-load-cancel * ci : add ctest_with_model for debug and release ggml-ci * Fix gg_get_model function ggml-ci * got stuck on CMake * Add get_model.cpp to tests/CMakeLists.txt ggml-ci * Fix README.md output for ctest_with_model ggml-ci * workflows : use `-L main` for all ctest ggml-ci * Fixes * GG_RUN_CTEST_MODELFILE => LLAMACPP_TESTMODELFILE * Always show warning rather than failing if model file variable is not set * scripts : update usage text for ci-run.sh	2024-01-26 14:18:00 +02:00
Georgi Gerganov	e9240cdfa0	scripts : add get-winogrande.sh	2024-01-18 20:45:39 +02:00
Georgi Gerganov	dcad445d0c	scritps : add helper script to get hellaswag data in txt format	2024-01-18 11:44:49 +02:00
Georgi Gerganov	6b6916b215	sync : ggml	2024-01-17 20:54:50 +02:00
Georgi Gerganov	9408cfdad6	scripts : sync-ggml-am.sh option to skip commits	2024-01-14 11:08:41 +02:00
Georgi Gerganov	76484fbfd3	sync : ggml	2024-01-14 00:14:46 +02:00
Johannes Gäßler	7dc78764e2	compare-llama-bench: tweak output format (#4910 )	2024-01-13 15:52:53 +01:00
Georgi Gerganov	de473f5f8e	sync : ggml	2024-01-12 22:02:43 +02:00
Georgi Gerganov	64802ec00d	sync : ggml	2024-01-11 09:39:08 +02:00
Johannes Gäßler	4f56458d34	Python script to compare commits with llama-bench (#4844 )	2024-01-10 01:04:33 +01:00
Georgi Gerganov	9a818f7c42	scripts : improve get-pg.sh (#4838 )	2024-01-09 19:21:13 +02:00
Georgi Gerganov	d9653894df	scripts : script to get Paul Graham essays in txt format (#4838 )	2024-01-09 16:23:05 +02:00

92 Commits