klosax
76b46627e2
convert-llama-hf-to-gguf.py : special tokens
2023-08-20 16:54:42 +02:00
klosax
28b8c265eb
cmpnct_gpt2bpe.hpp : cleanup
2023-08-19 18:26:51 +02:00
klosax
c0a1269b7f
Update examples/server/README.md
Co-authored-by: slaren <slarengh@gmail.com>
2023-08-19 15:27:37 +02:00
klosax
6a2e520095
cmpnct_gpt2bpe.hpp : remove non-general stuff
2023-08-19 13:19:02 +02:00
klosax
8945d47f52
gptneox-main.cpp : fixes
2023-08-19 12:09:24 +02:00
klosax
781bf2481f
falcon-main.cpp : fixes
2023-08-19 12:08:17 +02:00
klosax
dadf098b5a
cmpnct_gpt2bpe.hpp : fixes
2023-08-19 12:06:22 +02:00
klosax
b3a7a2b486
convert-falcon-hf-to-gguf.py : add tensor data layout
2023-08-19 12:05:11 +02:00
klosax
2c8055b65b
convert-falcon-hf-to-gguf.py : update ref
2023-08-19 01:08:39 +02:00
klosax
1d80eea574
falcon-main.cpp : fix for falcon 40b
2023-08-19 01:03:37 +02:00
klosax
bd5a57901b
gguf.py : fix for falcon 40b
2023-08-19 01:01:52 +02:00
klosax
281d6d1105
convert-llama-hf-to-gguf.py : remove extra kv
2023-08-19 00:32:56 +02:00
klosax
593b04fdcd
convert-llama-7b-pth-to-gguf.py : remove extra kv
2023-08-19 00:32:27 +02:00
klosax
c0e4ca630b
convert-gptneox-hf-to-gguf.py : remove extra kv
2023-08-19 00:31:56 +02:00
klosax
16ab9ba3b3
convert-falcon-hf-to-gguf.py : remove extra kv
2023-08-19 00:31:28 +02:00
klosax
d5e976c12b
falcon-main.cpp : falcon inference example
2023-08-19 00:02:18 +02:00
klosax
fb7c883cd3
convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested
2023-08-18 20:14:01 +02:00
Georgi Gerganov
25b8a8922d
llama : introduce enum llama_vocab_type + remove hardcoded string constants
2023-08-18 18:46:38 +03:00
Georgi Gerganov
a4ad2bf35c
llama : fix MPI build
ggml-ci
2023-08-18 17:34:27 +03:00
Georgi Gerganov
5d2656d670
llama : avoid hardcoded special tokens
2023-08-18 17:29:20 +03:00
Georgi Gerganov
035d511457
llama : minor API updates
2023-08-18 17:10:20 +03:00
Georgi Gerganov
2d6c2c757c
llama : remove C++ API + reorganize common source in /common dir
2023-08-18 16:22:48 +03:00
Georgi Gerganov
38016ed9ec
Merge branch 'master' into gguf
2023-08-18 15:21:48 +03:00
Georgi Gerganov
660ca9bbca
llama : re-order functions
2023-08-18 14:56:36 +03:00
slaren
097e121e2f
llama : add benchmark example (#2626)
* llama : add benchmark example
* add to examples CMakeLists.txt
* fix msvc build
* add missing include
* add Bessel's correction to stdev calculation
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* improve markdown formatting
* add missing include
* print warning if NDEBUG is not defined
* remove n_prompt and n_gen from the matrix, use each value separately instead
* better checks for non-optimized builds
* llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call
* fix json formatting
* add sql output
* add basic cpu and gpu info (linux/cuda only)
* markdown: also show values that differ from the default
* markdown: add build id
* cleanup
* improve formatting
* formatting
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2023-08-18 12:44:58 +02:00
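The "Bessel's correction" bullet in the benchmark commit above refers to dividing the summed squared deviations by (n - 1) rather than n when estimating the standard deviation from a sample of timing runs. A minimal illustrative sketch of that calculation follows; it is not the repository's actual benchmark code, and the helper name sample_stdev and its signature are assumptions for the example:

```cpp
#include <cmath>
#include <vector>

// Sample standard deviation with Bessel's correction: divide the sum of
// squared deviations by (n - 1) instead of n, which gives an unbiased
// variance estimate when the mean itself is computed from the same sample.
static double sample_stdev(const std::vector<double> & xs) {
    const size_t n = xs.size();
    if (n < 2) {
        return 0.0; // stdev is undefined for fewer than two samples
    }
    double mean = 0.0;
    for (double x : xs) {
        mean += x;
    }
    mean /= n;
    double sq_sum = 0.0;
    for (double x : xs) {
        sq_sum += (x - mean) * (x - mean);
    }
    return std::sqrt(sq_sum / (n - 1));
}
```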
mdrokz
eaf98c2649
readme : add link to Rust bindings (#2656)
2023-08-18 13:17:58 +03:00
Georgi Gerganov
e9b12c332e
perplexity : more meaningful ETA number - 2 decimal points
2023-08-18 12:48:55 +03:00
Georgi Gerganov
dea5be61d7
editorconfig : fix whitespaces
2023-08-18 12:42:38 +03:00
Georgi Gerganov
e35f8c744e
tests : update vocab file with new magic
2023-08-18 12:39:22 +03:00
Georgi Gerganov
856afff746
Merge branch 'master' into gguf
2023-08-18 12:38:05 +03:00
Georgi Gerganov
aa3efe87c8
llama : print number of tensors per type + print arch + style
2023-08-18 10:36:45 +03:00
klosax
b275de745d
llama.cpp : get special token kv and linefeed token id
2023-08-18 03:34:30 +02:00
Evan Jones
604b8bdfa6
Fix unicode in grammars (fixes #2501) (#2553)
* Fix unicode in grammars (fixes #2501)
* add more comments
* fix test-llama-grammar
2023-08-17 19:54:44 -04:00
staviq
10151bee2e
server : support for saving templates in browser LocalStorage (#2486)
* support for templates in browser LocalStorage
* sync accepted #2409 fix from upstream
* convert autosave invocation to useEffect
* Apply suggestions from code review
Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>
* Regen index.html.cpp, suggested from code review
---------
Co-authored-by: Jhen-Jie Hong <iainst0409@gmail.com>
2023-08-18 07:34:01 +08:00
klosax
306070c896
llama.cpp : print kv general.name
2023-08-18 01:06:27 +02:00
Johannes Gäßler
0992a7b8b1
README: fix LLAMA_CUDA_MMV_Y documentation (#2647)
2023-08-17 23:57:59 +02:00
klosax
d9e6890a51
test-tokenizer-0.cpp : fix warning
2023-08-17 23:34:21 +02:00
klosax
147a99bd3a
gguf.py : reverse GGUF_MAGIC
2023-08-17 23:24:04 +02:00
klosax
c20ae49b59
ggml.h : reverse GGUF_MAGIC
2023-08-17 23:23:17 +02:00
Henri Vasserman
6ddeefad9b
[Zig] Fixing Zig build and improvements (#2554)
* Fix zig after console.o was split
* Better include and flag management
* Change LTO to option
2023-08-17 23:11:18 +03:00
klosax
3c1b7217a9
convert-llama-7b-pth-to-gguf.py : fixes
2023-08-17 21:44:34 +02:00
klosax
9e2d4dd48e
convert-llama-hf-to-gguf.py : fixes
2023-08-17 21:43:48 +02:00
klosax
640ddc4259
gguf.py : gptneox mapping
2023-08-17 21:43:10 +02:00
klosax
b668cd3296
convert-gptneox-hf-to-gguf.py : fixes
2023-08-17 21:42:26 +02:00
M. Yusuf Sarıgöz
fc3a523211
gguf.py : write tensors in a single pass (#2644)
* gguf : single pass for writing tensors + refactoring writer
* gguf : single pass for writing tensors + refactoring writer
* gguf : single pass for writing tensors + refactoring writer
* gguf : style fixes in simple conversion script
* gguf : refactor gptneox conversion script
* gguf : rename h5 to hf (for HuggingFace)
* gguf : refactor pth to gguf conversion script
* gguf : rm file_type key and method
* gguf.py : fix vertical alignment
* gguf.py : indentation
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-17 21:57:39 +03:00
Georgi Gerganov
5484737d58
llama : fix tensor name grepping during quantization
ggml-ci
2023-08-17 21:40:51 +03:00
Georgi Gerganov
57eaadb853
llama : throw error if gguf fails to init from file
ggml-ci
2023-08-17 21:32:14 +03:00
klosax
b3cc182990
llama.cpp : typo
2023-08-17 20:27:50 +02:00
Georgi Gerganov
acaa98234a
convert.py : fix HF tensor permuting / unpacking
ggml-ci
2023-08-17 21:06:45 +03:00
klosax
78e1e57862
quantize-stats.cpp : .bin --> .gguf
2023-08-17 19:18:24 +02:00