210 Commits

Author SHA1 Message Date
Stephan Walter
a50e39c6fe
Revert "Delete SHA256SUMS for now" (#429)
* Revert "Delete SHA256SUMS for now (#416)"

This reverts commit 8eea5ae0e5.

* Remove ggml files until they can be verified
* Remove alpaca json
* Add also model/tokenizer.model to SHA256SUMS + update README

---------

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-23 15:15:48 +01:00
Gary Mulder
8a3e5ef801
Move model section from issue template to README.md (#421)
* Update custom.md

* Removed Model section as it is better placed in README.md

* Updates to README.md model section

* Inserted text that was removed from the issue template about obtaining models from FB and links to papers describing the various models

* Removed IPFS download links for the Alpaca 7B models as these look to be in the old data format and probably shouldn't be directly linked to, anyway

* Updated the perplexity section to point at Perplexity scores #406 discussion
2023-03-23 11:30:40 +00:00
Georgi Gerganov
93208cfb92
Adjust repetition penalty .. 2023-03-23 10:46:58 +02:00
Georgi Gerganov
03ace14cfd
Add link to recent podcast about whisper.cpp and llama.cpp 2023-03-23 09:48:51 +02:00
Gary Linscott
40ea807a97
Add details on perplexity to README.md (#395) 2023-03-22 08:53:54 -07:00
Georgi Gerganov
56817b1f88
Remove temporary notice and update hot topics 2023-03-22 07:34:02 +02:00
Gary Mulder
da0e9fe90c
Add SHA256SUMS file and instructions to README how to obtain and verify the downloads
Hashes created using:

sha256sum models/*B/*.pth models/*[7136]B/ggml-model-f16.bin* models/*[7136]B/ggml-model-q4_0.bin* > SHA256SUMS
2023-03-21 23:19:11 +01:00
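
The verification step that this commit documents can be sketched in Python. This is a minimal illustration, not the project's own tooling: it assumes SHA256SUMS uses the usual sha256sum line layout (hex digest, whitespace, path), and the names `sha256_of` and `verify_sums` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sums(sums_file, root="."):
    """Check each '<hex digest>  <path>' line of a sha256sum-style file.

    Returns a dict mapping path -> 'OK', 'FAILED' or 'MISSING'.
    (Hypothetical helper; the layout of SHA256SUMS is an assumption.)
    """
    results = {}
    for line in Path(sums_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with '*'
        target = Path(root) / name
        if not target.is_file():
            results[name] = "MISSING"
        elif sha256_of(target) == expected.lower():
            results[name] = "OK"
        else:
            results[name] = "FAILED"
    return results
```

Reading in chunks keeps memory use flat for multi-gigabyte model files. Reporting a missing file separately from a failed one matters when only some model sizes have been downloaded.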
Georgi Gerganov
3366853e41
Add notice about pending change 2023-03-21 22:57:35 +02:00
Georgi Gerganov
1daf4dd712
Minor style changes 2023-03-21 18:10:32 +02:00
Georgi Gerganov
dc6a845b85
Add chat.sh script 2023-03-21 18:09:46 +02:00
Georgi Gerganov
3bfa3b43b7
Fix convert script, warnings, alpaca instructions, default params 2023-03-21 17:59:16 +02:00
Kevin Kwok
e0ffc861fa
Update IPFS links to quantized alpaca with new tokenizer format (#352) 2023-03-21 17:34:49 +02:00
Mack Straight
074bea2eb1
sentencepiece bpe compatible tokenizer (#252)
* potential out of bounds read

* fix quantize

* style

* Update convert-pth-to-ggml.py

* mild cleanup

* don't need the space-prefixing here rn since main.cpp already does it

* new file magic + version header field

* readme notice

* missing newlines

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
2023-03-20 03:17:23 -07:00
Suaj Carrot
7392f1cd2c
Improved quantize script (#222)
* Improved quantize script

I improved the quantize script by adding error handling and making it possible to select several models for quantization at once on the command line. I also converted it to Python for generality and extensibility.

* Fixes and improvements based on Matt's observations

Fixed and improved many things in the script based on the reviews made by @mattsta. The parallelization suggestion is still to be revised, but code for it was added anyway (commented out).

* Small fixes to the previous commit

* Corrected to use the original glob pattern

The original Bash script uses a glob pattern to match files that have endings such as ...bin.0, ...bin.1, etc. That has been translated correctly to Python now.

* Added support for Windows and updated README to use this script

New code to set the name of the quantize script binary depending on the platform has been added (quantize.exe if working on Windows) and the README.md file has been updated to use this script instead of the Bash one.

* Fixed a typo and removed shell=True in the subprocess.run call

Fixed a typo regarding the new filenames of the quantized models and removed the shell=True parameter in the subprocess.run call as it was conflicting with the list of parameters.

* Corrected previous commit

* Small tweak: changed the name of the program in argparse

Previously the automatic help message showed the program's usage as literally "$ Quantization Script [arguments]". It should now be something like "$ python3 quantize.py [arguments]".
2023-03-19 20:38:44 +02:00
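
The points raised in this commit message (the glob for split files, the platform-dependent binary name, the list-form subprocess.run call, and the argparse program name) can be put together in a short sketch. It is not the script from the commit: the function names, the `extra_args` parameter, and the argument order passed to the quantize binary are assumptions made for illustration. Only the file names ggml-model-f16.bin and ggml-model-q4_0.bin come from this log.

```python
import argparse
import glob
import os
import subprocess
import sys


def quantize_binary_name(platform=sys.platform):
    """Pick the executable name; Windows builds carry an .exe suffix."""
    return "quantize.exe" if platform.startswith("win") else "quantize"


def find_f16_parts(model_dir):
    """Match ggml-model-f16.bin and split parts such as .bin.0, .bin.1."""
    return sorted(glob.glob(os.path.join(model_dir, "ggml-model-f16.bin*")))


def build_command(binary, f16_path, extra_args=()):
    """Return an argument list (not a shell string) for one input file.

    The argument order and extra_args are assumptions, not the real CLI.
    """
    folder, name = os.path.split(f16_path)
    out_path = os.path.join(folder, name.replace("f16", "q4_0"))
    return [binary, f16_path, out_path, *extra_args]


def main(argv=None):
    parser = argparse.ArgumentParser(
        prog="python3 quantize.py",  # shown in the generated usage line
        description="Quantization Script",
    )
    parser.add_argument("models", nargs="+", help="model directories")
    args = parser.parse_args(argv)

    binary = os.path.join(os.getcwd(), quantize_binary_name())
    for model_dir in args.models:
        for part in find_f16_parts(model_dir):
            # A list of arguments is passed, so shell=True is not used.
            subprocess.run(build_command(binary, part), check=True)
```

Passing a list to subprocess.run hands each argument to the program unchanged, which avoids quoting problems with paths that contain spaces. Combining a list with shell=True behaves differently across platforms, which fits the conflict the commit message describes.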
Georgi Gerganov
160bfb217d
Update hot topics to mention Alpaca support 2023-03-19 19:51:55 +02:00
Georgi Gerganov
a4e63b73df
Add instruction for using Alpaca (#240) 2023-03-19 18:49:50 +02:00
Pavol Rusnak
6f61c18ec9
Fix typo in readme 2023-03-18 23:18:04 +01:00
Pavol Rusnak
1e5a6d088d
Add note about Python 3.11 to readme 2023-03-18 22:25:35 +01:00
Pavol Rusnak
554b541521
Add memory/disk requirements to readme 2023-03-18 22:25:35 +01:00
Georgi Gerganov
e81b9c81c1
Update Contributing section 2023-03-17 20:30:04 +02:00
Stephan Walter
367946c668
Don't tell users to use a bad number of threads (#243)
The readme tells people to use the command line option "-t 8", causing 8
threads to be started. On systems with fewer than 8 cores, this causes a
significant slowdown. Remove the option from the example command lines
and use /proc/cpuinfo on Linux to determine a sensible default.
2023-03-17 19:47:35 +02:00
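
The idea of deriving a default thread count from /proc/cpuinfo can be illustrated as follows. This is a sketch in Python and not the change made in the commit: the function names and the fallback value of 4 are assumptions. It also counts logical processors, which can exceed the number of physical cores on machines with SMT.

```python
import os


def count_processors(cpuinfo_text):
    """Count 'processor : N' entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def default_thread_count(cpuinfo_path="/proc/cpuinfo", fallback=4):
    """Return the logical CPU count from cpuinfo, or a fallback elsewhere."""
    try:
        with open(cpuinfo_path, encoding="utf-8", errors="replace") as fh:
            found = count_processors(fh.read())
    except OSError:
        found = 0  # not Linux, or the file is unreadable
    # os.cpu_count() may return None; the fallback of 4 is an arbitrary assumption.
    return found or os.cpu_count() or fallback
```

Keeping the parsing in `count_processors` separate from the file access makes it possible to test the logic on systems without /proc.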
Bernat Vadell
2af23d3043
🚀 Dockerize llamacpp (#132)
* feat: dockerize llamacpp

* feat: split build & runtime stages

* split dockerfile into main & tools

* add quantize into tool docker image

* Update .devops/tools.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* add docker action pipeline

* change CI to publish at github docker registry

* fix name runs-on macOS-latest is macos-latest (lowercase)

* include docker versioned images

* fix github action docker

* fix docker.yml

* feat: include all-in-one command tool & update readme.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-17 10:47:06 +01:00
Georgi Gerganov
721311070e
Update README.md 2023-03-16 15:00:09 +02:00
Georgi Gerganov
ac15de7895
Expand "Contributing" section 2023-03-16 08:55:13 +02:00
Georgi Gerganov
273abc47ff
Update hot topics - RMSnorm 2023-03-16 07:12:12 +02:00
moritzbrantner
27944c4206
fixed typo (#178) 2023-03-15 22:35:25 +02:00
Musab Gultekin
977295c700
Fix potential licensing issue (#126)
* Update README.md

* Update README.md

remove facebook
2023-03-15 21:39:06 +02:00
Radoslav Gerganov
60f819a2b1
Add section to README on how to run the project on Android (#130) 2023-03-14 15:30:08 +02:00
Georgi Gerganov
97ab2b2578
Add Misc section + update hot topics + minor fixes 2023-03-14 09:43:52 +02:00
Georgi Gerganov
7ec903d3c1
Update contribution section, hot topics, limitations, etc. 2023-03-13 19:21:51 +02:00
Pavol Rusnak
d1f224712d
Add quantize script for batch quantization (#92)
* Add quantize script for batch quantization

* Indentation

* README for new quantize.sh

* Fix script name

* Fix file list on Mac OS

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13 18:15:20 +02:00
Georgi Gerganov
1808ee0500
Add initial contribution guidelines 2023-03-13 09:42:26 +02:00
Georgi Gerganov
1a0a74300f
Update README.md 2023-03-12 23:39:01 +02:00
Matvey Soloviev
96ea727f47
Add interactive mode (#61)
* Initial work on interactive mode.

* Improve interactive mode. Make rev. prompt optional.

* Update README to explain interactive mode.

* Fix OS X build
2023-03-12 23:13:28 +02:00
Marc Köhlbrugge
9661954835
Fix typo in README (#45) 2023-03-12 22:30:08 +02:00
Georgi Gerganov
7027a97837
Update README.md 2023-03-12 22:09:26 +02:00
Georgi Gerganov
7c9e54e55e
Revert "weights_only" arg - this is causing more trouble than help 2023-03-12 20:59:01 +02:00
Oleksandr Nikitin
b9bd1d0141
python/pytorch compat notes (#44) 2023-03-12 14:16:33 +02:00
Georgi Gerganov
702fddf5c5
Clarify meaning of hacking 2023-03-12 09:03:25 +02:00
Georgi Gerganov
7d86e25bf6
README: add "Supported platforms" + update hot topics 2023-03-12 08:41:54 +02:00
Georgi Gerganov
da1a4ff01f
Update README.md 2023-03-12 01:26:32 +02:00
Juraj Bednar
6b2cb6302f
Fix a typo in model name (#16) 2023-03-11 19:32:20 +02:00
Georgi Gerganov
4235e3d5b3
Update README.md 2023-03-11 18:10:18 +02:00
Georgi Gerganov
f1eaff4721
Add AVX2 support for x86 architectures thanks to @Const-me ! 2023-03-11 18:04:25 +02:00
Georgi Gerganov
0c6803321c
Update README.md 2023-03-11 12:31:21 +02:00
Georgi Gerganov
7211862c94
Update Makefile var + add comment 2023-03-11 12:27:02 +02:00
Georgi Gerganov
a5c5ae2f54
Update README.md 2023-03-11 11:34:25 +02:00
Georgi Gerganov
ea977e85ec
Update README.md 2023-03-11 11:34:11 +02:00
Georgi Gerganov
007a8f6f45
Support all LLaMA models + change Q4_0 quantization storage 2023-03-11 11:28:30 +02:00
Simon Willison
5f2f970d51
Include Python dependencies in README (#6) 2023-03-11 07:47:26 +02:00
Georgi Gerganov
73c6ed5e87
Update README.md 2023-03-11 01:30:47 +02:00
Georgi Gerganov
01eeed8fb1
Update README.md 2023-03-11 01:22:58 +02:00
Georgi Gerganov
6da2df34ee
Update README.md 2023-03-11 01:18:10 +02:00
Georgi Gerganov
920a7fe2d9
Update README.md 2023-03-11 00:55:22 +02:00
Georgi Gerganov
3a57ee59de
Update README.md 2023-03-11 00:51:46 +02:00
Georgi Gerganov
b85028522d
Update README.md 2023-03-11 00:09:19 +02:00
Georgi Gerganov
8a01f565ff
Update README.md 2023-03-10 23:53:11 +02:00
Georgi Gerganov
18ebda34d6
Update README.md 2023-03-10 21:52:27 +02:00
Georgi Gerganov
319cdb3e1f
Final touches 2023-03-10 21:50:46 +02:00
Georgi Gerganov
775328064e
Create README.md 2023-03-10 21:47:46 +02:00