Commit Graph

  • 585d91a156 cmake : add explicit F16C option (x86) (#576) master-585d91a anzz1 2023-04-13 15:48:21 +0300
  • 95ea26f6e9 benchmark : add tool for timing q4_0 matrix multiplication (#653) master-95ea26f SebastianApel 2023-04-13 14:46:23 +0200
  • d21e188c6a Merge branch 'master' into master Georgi Gerganov 2023-04-13 15:45:33 +0300
  • 902075752a Add sentencepiece processor aeslampanah 2023-04-13 07:58:45 -0400
  • 7c8ee5aec5 Updated tokenconvert.py script to add support for SentencePiece and WordPiece tokenizers, updated arguments aeslampanah 2023-04-13 07:05:29 -0400
  • ed70ea9595 Merge pull request #1 from ggerganov/master qwopqwop200 2023-04-13 19:09:27 +0900
  • 97d7ac7565 POC: Measure rmse of 8 bit quantization Iwan Kawrakow 2023-04-13 12:00:24 +0200
  • acbab12a89 replaced use of auto with exact type to avoid using -std=c++14 Arik Poznanski 2023-04-13 12:37:05 +0300
  • 82d146df9b do not force the prompt file to end with a new line (#908) Pavol Rusnak 2023-04-13 11:33:16 +0200
  • 9b52fc3aa6 do not force the prompt file to end with a new line Pavol Rusnak 2023-04-12 08:33:13 +0200
  • ca297c190f up version Concedo 2023-04-13 14:38:38 +0800
  • c1b75f38d0 try to fix noavx2 for really old devices by Concedo 2023-04-13 14:36:00 +0800
  • 6f34961559 POC: Q4_1 for groups of 16 weight Iwan Kawrakow 2023-04-13 08:31:21 +0200
  • db4b29301c Add files via upload qwopqwop200 2023-04-13 15:30:14 +0900
  • 8b9316be70 fix tab qwopqwop200 2023-04-13 15:16:39 +0900
  • e01b2d04c9 fix tab qwopqwop200 2023-04-13 15:07:02 +0900
  • 75b39c4b26 fix tab qwopqwop200 2023-04-13 15:05:34 +0900
  • b0c6171cd7 fix tab qwopqwop200 2023-04-13 15:03:29 +0900
  • d405209c45 add Q4_2 qwopqwop200 2023-04-13 14:55:16 +0900
  • ff0efc747d add Q4_2 qwopqwop200 2023-04-13 14:54:44 +0900
  • f0b14e8c69 add Q4_2 qwopqwop200 2023-04-13 14:53:23 +0900
  • 716bd8fcfa Add files via upload qwopqwop200 2023-04-13 14:52:49 +0900
  • db77f1b48a add q4_2 qwopqwop200 2023-04-13 14:52:19 +0900
  • 8694318c71 try-catch jon-chuang 2023-04-13 13:14:15 +0800
  • f181c28edd fix jon-chuang 2023-04-13 13:01:18 +0800
  • 1caa4dcf94 commit jon-chuang 2023-04-13 12:55:38 +0800
  • 2ff91b5570 Merge remote-tracking branch 'occam/clblast-1' into concedo Concedo 2023-04-13 11:39:35 +0800
  • 5c22f7e4c4 reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode. Concedo 2023-04-13 11:32:05 +0800
  • 5858d410fd Remove python 3.10 warning CRD716 2023-04-12 20:30:25 -0500
  • fe24af09ba Replaced static initialization of complex objects with initialization on first use. This prevents undefined behavior at program run, for example a crash in a Release build that does not occur in a Debug build Arik Poznanski 2023-04-13 01:51:34 +0300
  • a1b4e48ba2 fixO wbpxre150 2023-04-13 05:46:56 +0800
  • f45dec3c2a fixes wbpxre150 2023-04-13 05:32:12 +0800
  • 67d220210f Revert buffer changes, no improvements in benchmarks 0cc4m 2023-04-12 23:09:12 +0200
  • c7e5c4f7b2 Improve ClBlast implementation, avoid recreating buffers, remove redundant transfers 0cc4m 2023-04-11 21:53:50 +0200
  • 0d903962e5 add command line to interactive mode. You can specify different values for everything in params during runtime. wbpxre150 2023-04-13 04:58:35 +0800
  • 47d809c692 move code to signal handler wbpxre150 2023-04-13 02:03:48 +0800
  • f4257a8eef Merge branch 'master' into concedo Concedo 2023-04-12 23:25:45 +0800
  • 1bd5992da4 clean and refactor handling of flags Concedo 2023-04-12 23:25:31 +0800
  • 679e1cb6c0 POC: Even lower rmse 4-bit Q4_0 quantization Iwan Kawrakow 2023-04-12 17:10:52 +0200
  • 96846dd2ff Remove <alloca.h>. Olaf Seibert 2023-04-12 17:01:30 +0200
  • 7d21d5ebb4 Eliminate alloca from ggml_opt_lbfgs() Olaf Seibert 2023-04-12 16:47:45 +0200
  • 0260aa67fc Eliminate alloca from ggml_graph_compute() Olaf Seibert 2023-04-12 16:35:23 +0200
  • a0f3de5b84 This file doesn't even use alloca. Olaf Seibert 2023-04-12 16:32:47 +0200
  • e7f6997f89 Don't crash on ftype (formerly f16) == 4 (#917) master-e7f6997 Stephan Walter 2023-04-12 15:06:16 +0000
  • 99da5491c5 Don't crash on ftype (formerly f16) == 4 Stephan Walter 2023-04-12 16:47:00 +0200
  • 29b83e5fd6 Various experiments, including 5-bit quantization Iwan Kawrakow 2023-04-12 16:25:19 +0200
  • 636f8e5a8e updated the quantize files and makefile Concedo 2023-04-12 21:40:25 +0800
  • f76cb3a34d readme : change "GPU support" link to discussion Georgi Gerganov 2023-04-12 14:48:57 +0300
  • 782438070f readme : update hot topics with link to "GPU support" issue Georgi Gerganov 2023-04-12 14:31:12 +0300
  • 4faae0afa9 Merged upstream, fixed OSX compile errors, integrated noavx2 build into main Concedo 2023-04-12 18:08:55 +0800
  • bcd327c221 chore: add nodejs binding hlhr202 2023-04-12 16:27:45 +0800
  • 24f2a6e03d chore: add nodejs binding hlhr202 2023-04-12 16:27:01 +0800
  • 2444a99db5 Fix make compile error in expose.cpp(?) (#44) rabidcopy 2023-04-12 03:19:38 -0500
  • c55eb784cb add exit call to interactive mode. wbpxre150 2023-04-12 15:29:59 +0800
  • 4dbbd40750 readme: link to sha256sums file (#902) Nicolai Weitkemper 2023-04-12 08:46:20 +0200
  • 6bfb00a53b Further improve Q4_0 MSE Iwan Kawrakow 2023-04-12 07:38:42 +0200
  • a2a2b2cf13 readme: link to sha256sums file Nicolai Weitkemper 2023-04-12 01:26:48 +0200
  • c59009a835 apply suggestions Tomáš Pazdiora 2023-04-11 23:19:27 +0200
  • ab73745993 update formatting Tomáš Pazdiora 2023-04-11 23:08:25 +0200
  • c4c7ead0bd update formatting Tomáš Pazdiora 2023-04-11 23:06:38 +0200
  • 414b66fcc4 Merge remote-tracking branch 'origin/master' into cli-ui-update Tomáš Pazdiora 2023-04-11 22:55:36 +0200
  • 931ae36050 Improve Q4_0 MSE Iwan Kawrakow 2023-04-11 22:08:47 +0200
  • 8b679987cd Fix whitespace, add .editorconfig, add GitHub workflow (#883) master-8b67998 Pavol Rusnak 2023-04-11 21:45:44 +0200
  • b6df974577 Reverting round() change so we can pass tests Iwan Kawrakow 2023-04-11 20:38:14 +0200
  • 303c5caa0e Fix whitespace, add .editorconfig, add GitHub workflow Pavol Rusnak 2023-04-10 22:09:54 +0200
  • ca69e05d1f update readme and fixed typos Concedo 2023-04-11 23:53:21 +0800
  • 9245c7d7d0 Merge branch 'master' into concedo Concedo 2023-04-11 23:38:15 +0800
  • 23c675b2e6 integrated optional (experimental) CLBlast support Concedo 2023-04-11 23:33:44 +0800
  • 3e6e70d8e8 Add enum llama_ftype, sync ggml_type to model files (#709) master-3e6e70d Stephan Walter 2023-04-11 15:03:51 +0000
  • 709d23543a Add new quantization to quantize Iwan Kawrakow 2023-04-11 15:32:41 +0200
  • 92408cd983 Add ability to use new quantization in quantize-stats Iwan Kawrakow 2023-04-11 13:03:51 +0200
  • 8b3d1f977b Remove forgotten remnant from a discarded change to ggml.c Iwan Kawrakow 2023-04-11 13:01:22 +0200
  • 0c9a967a20 Adding new functions for Q4_0 and Q4_1 quantization Iwan Kawrakow 2023-04-11 12:44:47 +0200
  • 126b984482 Use better conversion to ints Iwan Kawrakow 2023-04-11 12:41:14 +0200
  • 2663d2c678 Windows fixes (#890) master-2663d2c comex 2023-04-11 06:19:54 -0700
  • c9f18082fd Merge remote-tracking branch 'occam/clblast' into concedo Concedo 2023-04-11 17:01:31 +0800
  • 1f6aa47b6e Merge branch 'master' into concedo Concedo 2023-04-11 16:53:41 +0800
  • 471789ab54 requirements.txt Eric Hartford 2023-04-10 21:58:15 -0700
  • fb7ad401b6 Windows fixes comex 2023-04-10 19:16:39 -0700
  • 67cc4cb475 minimize the changes. zjli2019 2023-04-11 10:20:03 +0800
  • 5541a48d49 Merge remote-tracking branch 'upstream/master' into eval-thread-count ml6 2023-04-10 17:08:31 -0700
  • cb92071b2a Fixed rlimit error message. apaz-cli 2023-04-10 18:59:36 -0500
  • 40e9d4acba Merge 6e65b8a817 into a0caa34b16 John 2023-04-10 22:37:56 +0000
  • 6e65b8a817 Updated preloader to use multithreading. Tested on Windows: a small performance hit during loading is not avoidable, but this is the fastest method I found. On Linux: madvise needs a test to confirm it's working; otherwise readahead() needs to be implemented in the TODO region John 2023-04-11 00:36:00 +0200
  • f4c1c6b97a Updated preloader to use multithreading, currently set to 50% of the available threads on the system. Tested on Windows: a small performance hit during loading is not avoidable, but this is the best possible solution. On Linux John 2023-04-11 00:28:04 +0200
  • 74b92ff6b8 Add helper script to convert hf (pytorch) models into ggml format aeslampanah 2023-04-10 17:21:33 -0400
  • 0f934b79de Merge 94ddd6204c into a0caa34b16 Howard Su 2023-04-10 22:58:20 +0200
  • a0caa34b16 Add BAIR's Koala to supported models (#877) qouoq 2023-04-11 04:41:53 +0800
  • 461ba9e66e ggml : fix WASM build master-461ba9e Georgi Gerganov 2023-04-10 23:20:01 +0300
  • 55ffe2e46c Fix whitespace, add .editorconfig, add GitHub workflow Pavol Rusnak 2023-04-10 22:09:54 +0200
  • c3ac702e5e ggml : add ggml_cont() + optimize ggml_cpy() for contiguous dst master-c3ac702 Georgi Gerganov 2023-04-10 22:40:28 +0300
  • 9d634ef452 ggml : remove trailing whitespaces Georgi Gerganov 2023-04-10 19:32:45 +0300
  • d9a239c410 Simplify to include lower-case windows.h always, fix compile on mingw32 (#747) master-d9a239c Marco Matthies 2023-04-10 19:57:59 +0200
  • 417e3f409d Merge branch 'master' into fix-mingw32-includes Pavol Rusnak 2023-04-10 19:55:10 +0200
  • 5f7b3837a6 Add BAIR's Koala to supported models qouoq 2023-04-11 01:53:45 +0800
  • 684da25926 ggml : fix quantize_row_q4_1() ARM_NEON (close #876) master-684da25 Georgi Gerganov 2023-04-10 19:29:48 +0300
  • c3db99ea32 Allow use of OpenCL GPU-based BLAS using ClBlast instead of OpenBLAS for context processing 0cc4m 2023-04-10 09:49:40 +0200
  • 69b85f5b61 fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size. Concedo 2023-04-11 00:14:57 +0800
  • 776b2cb135 Add enum llama_ftype, sync ggml_type to model files Stephan Walter 2023-04-02 13:37:23 +0200
  • 94ddd6204c Simplify the logic of scheduling Howard Su 2023-04-10 22:37:37 +0800