llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-27 20:04:35 +00:00

Author	SHA1	Message	Date
Slaren	ac184d5147	Always initialize mm_addr and mm_length in llama_model	2023-03-30 12:28:25 -07:00
Slaren	276e5b7811	Unmap the file in llama_free	2023-03-30 12:28:25 -07:00
Slaren	d68c5dc435	Make mmap_file static	2023-03-30 12:28:25 -07:00
Slaren	64bde3ffd4	Fix ggml_init_params in quantize	2023-03-30 12:28:25 -07:00
Slaren	c03ae8dca1	Add mmap support for model files	2023-03-30 12:28:25 -07:00
Stephan Walter	3bcc129ba8	cmake : properly invoke CTest (#629)	2023-03-30 20:56:59 +03:00
Casey Primozic	a4755cf288	Remove unused variable (#607) * It seems some new warning were added recently that exposed this. I wrote the code that included this unused variable originally and it is indeed not needed.	2023-03-30 17:53:35 +00:00
david raistrick	1f0414feec	make : fix darwin f16c flags check (#615) ...there was no check. ported upstream from https://github.com/zanussbaum/gpt4all.cpp/pull/2 (I dont see any clean path for upstream patches)	2023-03-30 20:34:45 +03:00
Georgi Gerganov	77efdf5a50	ggml : fix NEON signs (close #620, #622)	2023-03-30 20:27:32 +03:00
slaren	ed3c680bcd	Fix GGML_F32Cx8_STORE in AVX without F16C path (#619)	2023-03-30 11:16:30 +02:00
anzz1	9cbc404ba6	ci : re-enable AVX512 testing (Windows-MSVC) (#584) * CI: Re-enable AVX512 testing (Windows-MSVC) Now with 100% less base64 encoding * plain __cpuid is enough here	2023-03-29 23:44:39 +03:00
Georgi Gerganov	b51c717d5c	ggml : init time on first ggml_init() call	2023-03-29 22:15:34 +03:00
Georgi Gerganov	0ba76c1e73	llama : fix compile warnings when reading the vocab	2023-03-29 22:13:12 +03:00
Georgi Gerganov	cea1c85948	ggml : add ARM_NEON dequantize_row_q4_1()	2023-03-29 22:10:01 +03:00
Georgi Gerganov	f202ada131	ggml : add ARM_NEON quantize_row_q4_1()	2023-03-29 22:03:07 +03:00
Georgi Gerganov	3b44d30d9b	ggml : add ARM_NEON ggml_vec_dot_q4_1()	2023-03-29 22:03:07 +03:00
Pavol Rusnak	61cbfff5c9	rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) to match filenames of other converters	2023-03-29 20:09:25 +02:00
Thérence	d9ad104440	Create chat-13B.bat (#592) * Create chat-13B.bat Same script than chat-13B.sh, but for windows users. Tested and working on windows 10/11 v 22H2 * Apply suggestions from code review --------- Co-authored-by: anzz1 <anzz1@live.com>	2023-03-29 20:21:09 +03:00
Georgi Gerganov	b467702b87	readme : fix typos	2023-03-29 19:38:31 +03:00
Georgi Gerganov	516d88e75c	readme : add GPT4All instructions (close #588)	2023-03-29 19:37:20 +03:00
Georgi Gerganov	53635c081c	py : add GPT4All conversion script For now: copy-paste Too much time for me to deduplicate the python code	2023-03-29 19:29:52 +03:00
Maël Kerbiriou	41318d708e	llama : use the same threshold for OpenBLAS and ggml thread limiting (#577)	2023-03-29 19:10:07 +03:00
Tobias Lütke	a6956b25a1	add example of re-act pattern (#583) * add example of re-act pattern * spelling... * fixed whitespace in reverse prompt issue	2023-03-29 10:10:24 -05:00
anzz1	83df5639eb	Fix GCC warning about binary literal (#595) 0b10101010 -> 0xAA /* 0b10101010 */	2023-03-29 13:20:07 +00:00
anzz1	a5c42c4b13	Fix typo in llama.h (#593)	2023-03-29 13:19:29 +00:00
anzz1	5a5f8b1501	Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) * Enable Fused-Multiply-Add (FMA) instructions on MSVC __FMA__ macro does not exist in MSVC * Enable F16C/CVT16 vector extensions on MSVC __F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512 * MSVC cvt intrinsics * Add __SSE3__ macro for MSVC too because why not even though it's not currently used for anything when AVX is defined	2023-03-28 22:44:29 +03:00
anzz1	f1217055ea	CI: fix subdirectory path globbing (#546) - Changes in subdirectories will now be detecter properly - (Windows-MSVC) AVX512 tests temporarily disabled	2023-03-28 22:43:25 +03:00
anzz1	7f4c5c6651	llama : fix linkage with mingw (#551) * Revert `7e53955` (#542) Still needs to be fixed properly * Fix linking on mingw32	2023-03-28 21:23:09 +03:00
slaren	2a98bc18ea	ggml : add AVX2 implementation of quantize_row_q4_1 (#515) * Add AVX2 implementation of quantize_row_q4_1 * Actually use AVX2 * Make quantize_row_q4_1 static Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 21:06:03 +03:00
thement	d0aaff571c	py : add temporary script to convert old ggml files to newer version (#539) Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>	2023-03-28 20:55:42 +03:00
Tai Duc Nguyen	d0330fd783	py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403)	2023-03-28 20:51:29 +03:00
Stephan Walter	99c5b27654	ggml : refactor quantized processing functions (#509) * Refactor quantized processing functions * ggml : minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 20:13:01 +03:00
DooWoong Lee (David)	692ce3164e	py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547)	2023-03-28 20:02:34 +03:00
Georgi Gerganov	96f9c0506f	ci : make ctest verbose, hopefully we see what is wrong with the sanitizer	2023-03-28 20:01:09 +03:00
Georgi Gerganov	d502bc7c9d	tests : free llama context at the end of the test	2023-03-28 19:51:55 +03:00
Stephan Walter	436e561931	all : be more strict about converting float to double (#458) * Be more strict about converting float to double * Test equivalence of round, SILU implementations Test module is commented out in CMakeLists.txt because the tests may take a long time, depending on how much the compiler optimizes. * Fix softmax in perplexity.cpp * all : prefer float over double where appropriate * perplexity : add <cmath> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 19:48:20 +03:00
Jed Fox	20e1e84884	deploy : add a Package.swift for SwiftPM support (#393) * Add a Package.swift for SwiftPM support * Swap from exclusions to allowlist	2023-03-28 19:39:01 +03:00
Stephan Walter	c1f885067c	ggml : introduce structs for the q4 data blocks (#356) * Introduce structs for the q4 data blocks * ggml : rename quant struct variables + fix ARM_NEON --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 18:56:03 +03:00
Georgi Gerganov	e0670260fb	gitignore : add "embedding"	2023-03-28 18:34:35 +03:00
dotpy314	28ba975aea	Check the existence of f16_model_path_base in quantize.py (#574) Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>	2023-03-28 18:06:28 +03:00
slaren	a6bdc47cba	Fix usage of F16C intrinsics in AVX code (#563) * Fix usage of F16C intrinsics in AVX code when F16C is not defined	2023-03-28 17:26:55 +03:00
anzz1	7b8dbcb78b	main.cpp fixes, refactoring (#571) - main: entering empty line passes back control without new input in interactive/instruct modes - instruct mode: keep prompt fix - instruct mode: duplicate instruct prompt fix - refactor: move common console code from main->common	2023-03-28 17:09:55 +03:00
RJ Adriaansen	4b8efff0e3	Add embedding example to Makefile (#540)	2023-03-28 09:11:09 +03:00
Marco Matthies	7e5395575a	Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542)	2023-03-27 07:55:26 +03:00
Erik Scholz	34c1072e49	ci: add debug build to sanitizer build matrix (#527)	2023-03-26 15:48:40 +00:00
Stephan Walter	939ad2d3a5	Fix undefined variables in debug build, remove unused variables (#531)	2023-03-26 15:34:02 +00:00
Juan Calderon-Perez	8c2ec5e21d	Add support for linux/arm64 platform during Docker Builds (#514) * Add support for linux/arm64 platform * Add platform to versioned builds	2023-03-26 14:48:42 +00:00
Stephan Walter	b391579db9	Update README and comments for standalone perplexity tool (#525)	2023-03-26 16:14:01 +03:00
anzz1	7a87d31f4f	[main] fix infinite generation (-n == -1) (#523)	2023-03-26 16:06:10 +03:00
Georgi Gerganov	348d6926ee	Add logo to README.md	2023-03-26 10:20:49 +03:00

2112 Commits