Commit Graph

1217 Commits

Author SHA1 Message Date
Georgi Gerganov
25b8a8922d
llama : introduce enum llama_vocab_type + remove hardcoded string constants 2023-08-18 18:46:38 +03:00
Georgi Gerganov
a4ad2bf35c
llama : fix MPI build
ggml-ci
2023-08-18 17:34:27 +03:00
Georgi Gerganov
5d2656d670
llama : avoid hardcoded special tokens 2023-08-18 17:29:20 +03:00
Georgi Gerganov
035d511457
llama : minor API updates 2023-08-18 17:10:20 +03:00
Georgi Gerganov
2d6c2c757c
llama : remove C++ API + reorganize common source in /common dir 2023-08-18 16:22:48 +03:00
Georgi Gerganov
38016ed9ec
Merge branch 'master' into gguf 2023-08-18 15:21:48 +03:00
Georgi Gerganov
660ca9bbca
llama : re-order functions 2023-08-18 14:56:36 +03:00
slaren
097e121e2f
llama : add benchmark example (#2626)
* llama : add benchmark example

* add to examples CMakeLists.txt

* fix msvc build

* add missing include

* add Bessel's correction to stdev calculation

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* improve markdown formatting

* add missing include

* print warning if NDEBUG is not defined

* remove n_prompt and n_gen from the matrix, use each value separately instead

* better checks for non-optimized builds

* llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call

* fix json formatting

* add sql output

* add basic cpu and gpu info (linux/cuda only)

* markdown: also show values that differ from the default

* markdown: add build id

* cleanup

* improve formatting

* formatting

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2023-08-18 12:44:58 +02:00
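The "Bessel's correction to stdev calculation" bullet in the benchmark PR above refers to the standard fix for estimating spread from a sample: divide the squared deviations by n - 1 rather than n. A minimal Python sketch of the idea (the function name here is illustrative, not the actual llama-bench code, which is C++):

```python
import math

def stdev(samples):
    # Sample standard deviation with Bessel's correction:
    # dividing by (n - 1) instead of n gives an unbiased
    # estimate of the variance from a finite sample.
    n = len(samples)
    if n < 2:
        return 0.0
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(var)
```

For repeated benchmark timings, the correction matters most at small sample counts, where dividing by n would noticeably understate the variance.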
mdrokz
eaf98c2649
readme : add link to Rust bindings (#2656) 2023-08-18 13:17:58 +03:00
Georgi Gerganov
e9b12c332e
perplexity : more meaningful ETA number - 2 decimal points 2023-08-18 12:48:55 +03:00
Georgi Gerganov
dea5be61d7
editorconfig : fix whitespaces 2023-08-18 12:42:38 +03:00
Georgi Gerganov
e35f8c744e
tests : update vocab file with new magic 2023-08-18 12:39:22 +03:00
Georgi Gerganov
856afff746
Merge branch 'master' into gguf 2023-08-18 12:38:05 +03:00
Georgi Gerganov
aa3efe87c8
llama : print number of tensors per type + print arch + style 2023-08-18 10:36:45 +03:00
klosax
b275de745d
llama.cpp : get special token kv and linefeed token id 2023-08-18 03:34:30 +02:00
Evan Jones
604b8bdfa6
Fix unicode in grammars (fixes #2501) (#2553)
* Fix unicode in grammars (fixes #2501)

* add more comments

* fix test-llama-grammar
2023-08-17 19:54:44 -04:00
staviq
10151bee2e
server : support for saving templates in browser LocalStorage (#2486)
* support for templates in browser LocalStorage

* sync accepted #2409 fix from upstream

* convert autosave invocation to useEffect

* Apply suggestions from code review

Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>

* Regen index.html.cpp, suggested from code review

---------

Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>
2023-08-18 07:34:01 +08:00
klosax
306070c896
llama.cpp : print kv general.name 2023-08-18 01:06:27 +02:00
Johannes Gäßler
0992a7b8b1
README: fix LLAMA_CUDA_MMV_Y documentation (#2647) 2023-08-17 23:57:59 +02:00
klosax
d9e6890a51
test-tokenizer-0.cpp : fix warning 2023-08-17 23:34:21 +02:00
klosax
147a99bd3a
gguf.py : reverse GGUF_MAGIC 2023-08-17 23:24:04 +02:00
klosax
c20ae49b59
ggml.h : reverse GGUF_MAGIC 2023-08-17 23:23:17 +02:00
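The two "reverse GGUF_MAGIC" commits above change the byte order of the magic constant. A 4-byte magic such as b"GGUF" corresponds to two different u32 values depending on whether it is read as little- or big-endian, so reversing the constant flips that interpretation. A small Python illustration (the hex values shown are what these bytes decode to, not necessarily the exact constants used in ggml.h):

```python
import struct

# The four magic bytes at the start of a GGUF file.
MAGIC_BYTES = b"GGUF"

# The same bytes as a u32, under each byte order.
magic_le = struct.unpack("<I", MAGIC_BYTES)[0]  # little-endian
magic_be = struct.unpack(">I", MAGIC_BYTES)[0]  # big-endian

print(hex(magic_le))  # 0x46554747
print(hex(magic_be))  # 0x47475546
```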
Henri Vasserman
6ddeefad9b
[Zig] Fixing Zig build and improvements (#2554)
* Fix zig after console.o was split

* Better include and flag management

* Change LTO to option
2023-08-17 23:11:18 +03:00
klosax
3c1b7217a9
convert-llama-7b-pth-to-gguf.py : fixes 2023-08-17 21:44:34 +02:00
klosax
9e2d4dd48e
convert-llama-hf-to-gguf.py : fixes 2023-08-17 21:43:48 +02:00
klosax
640ddc4259
gguf.py : gptneox mapping 2023-08-17 21:43:10 +02:00
klosax
b668cd3296
convert-gptneox-hf-to-gguf.py : fixes 2023-08-17 21:42:26 +02:00
M. Yusuf Sarıgöz
fc3a523211
gguf.py : write tensors in a single pass (#2644)
* gguf : single pass for writing tensors + refactoring writer

* gguf : single pass for writing tensors + refactoring writer

* gguf : single pass for writing tensors + refactoring writer

* gguf : style fixes in simple conversion script

* gguf : refactor gptneox conversion script

* gguf : rename h5 to hf (for HuggingFace)

* gguf : refactor pth to gguf conversion script

* gguf : rm file_type key and method

* gguf.py : fix vertical alignment

* gguf.py : indentation

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-17 21:57:39 +03:00
Georgi Gerganov
5484737d58
llama : fix tensor name grepping during quantization
ggml-ci
2023-08-17 21:40:51 +03:00
Georgi Gerganov
57eaadb853
llama : throw error if gguf fails to init from file
ggml-ci
2023-08-17 21:32:14 +03:00
klosax
b3cc182990
llama.cpp : typo 2023-08-17 20:27:50 +02:00
Georgi Gerganov
acaa98234a
convert.py : fix HF tensor permuting / unpacking
ggml-ci
2023-08-17 21:06:45 +03:00
klosax
78e1e57862
quantize-stats.cpp : .bin --> .gguf 2023-08-17 19:18:24 +02:00
klosax
fb11dd3f92
common.h : .bin --> .gguf 2023-08-17 19:16:35 +02:00
Georgi Gerganov
e72c8c2124
ggml : fix bug in gguf_set_kv
ggml-ci
2023-08-17 20:13:48 +03:00
Georgi Gerganov
899f9a5350
llama : fix lambda capture
ggml-ci
2023-08-17 19:49:45 +03:00
Georgi Gerganov
93f285bdf1
gptneox : move as a WIP example 2023-08-17 19:49:45 +03:00
Georgi Gerganov
81a2c2a6f4
llama : fix llama_model_loader memory leak 2023-08-17 19:49:02 +03:00
Georgi Gerganov
dd9e2fc988
ci : update ".bin" to ".gguf" extension
ggml-ci
2023-08-17 19:32:14 +03:00
Georgi Gerganov
c3b739374e
editorconfig : ignore models folder
ggml-ci
2023-08-17 19:17:25 +03:00
Georgi Gerganov
6d66ef96eb
Merge branch 'master' into gguf 2023-08-17 19:04:59 +03:00
Georgi Gerganov
11bf4366c2
llama : sync with recent PRs on master 2023-08-17 19:03:15 +03:00
Georgi Gerganov
8ace03ad3d
convert.py : better always have n_head_kv and default it to n_head 2023-08-17 18:47:06 +03:00
klosax
d646c4efce
convert.py : n_head_kv optional and .gguf file extension 2023-08-17 17:20:36 +02:00
Georgi Gerganov
dd016cc246
Revert "ci : disable CI temporary to not waste energy"
This reverts commit 7e82d25f40.
2023-08-17 17:23:16 +03:00
Georgi Gerganov
2ddd9681d6
convert.py : update to support GGUF output 2023-08-17 17:22:43 +03:00
Georgi Gerganov
e0429d38e4
convert-new.py : output gguf (#2635)
* convert-new.py : output gguf (WIP)

* convert-new.py : add gguf key-value pairs

* llama : add hparams.ctx_train + no longer print ftype

* convert-new.py : minor fixes

* convert-new.py : vocab-only option should work now

* llama : fix tokenizer to use llama_char_to_byte

* tests : add new ggml-vocab-llama.gguf

* convert-new.py : tensor name mapping

* convert-new.py : add map for skipping tensor serialization

* convert-new.py : convert script now works

* gguf.py : pick some of the refactoring from #2644

* convert-new.py : minor fixes
2023-08-17 17:19:52 +03:00
Kerfuffle
8dae7ce684
Add --cfg-negative-prompt-file option for examples (#2591)
Add --cfg-negative-prompt-file option for examples
2023-08-17 07:29:44 -06:00
klosax
d6fd53afd6
llama.cpp : use ggml_elements() 2023-08-17 15:24:35 +02:00
klosax
5a0a2c5685
llama.cpp : print actual model size 2023-08-17 15:18:16 +02:00