Commit Graph

  • b208c252a2 Merge branch 'master' into jon/tall-and-skinny-matmul jon-chuang 2023-04-16 02:10:58 +0800
  • 00cca253e3 Made the assessors functions for static maps be static const Arik Poznanski 2023-04-15 20:48:26 +0300
  • ad5676810a merge CLBlast improvements - GPU dequant Concedo 2023-04-16 01:17:40 +0800
  • 3e992eabb4 Merge remote-tracking branch 'occam/clblast-gpu-dequant' into concedo Concedo 2023-04-16 00:26:54 +0800
  • 0ad964631f Refactor ggml.c for future tensor types (#1001) master-0ad9646 Stephan Walter 2023-04-15 16:25:38 +0000
  • 3eb1c1850e accept non positional model arg Concedo 2023-04-16 00:23:07 +0800
  • 524b2011d7 QK40 -> QK4_0 etc. Stephan Walter 2023-04-15 18:15:23 +0200
  • 81edec9776 done jon-chuang 2023-04-16 00:12:04 +0800
  • 57d046eeb6 Enable dequantization on GPU for ClBlast 0cc4m 2023-04-15 12:03:11 +0200
  • 472145c707 Refactor ggml.c for future tensor types Stephan Walter 2023-04-15 15:27:53 +0200
  • e95b6554b4 ggml : add Q8_0 quantization for intermediate results (#951) master-e95b655 Georgi Gerganov 2023-04-15 17:53:22 +0300
  • 02258616ef minor jon-chuang 2023-04-15 22:27:23 +0800
  • a38b9d7fab minor jon-chuang 2023-04-15 21:58:10 +0800
  • 6bf6543a6a format jon-chuang 2023-04-15 21:57:39 +0800
  • 00e86b97cc commit jon-chuang 2023-04-15 21:54:37 +0800
  • 60f27ed887 ggml : fix q4_1 dot func Georgi Gerganov 2023-04-15 15:06:38 +0300
  • 69511b2c4a Merge branch 'master' into jon/tall-and-skinny-matmul jon-chuang 2023-04-15 19:57:48 +0800
  • 73e7601bf3 stash jon-chuang 2023-04-15 19:57:01 +0800
  • 01de5c5420 quantize-stats : delete obsolete strings Georgi Gerganov 2023-04-14 21:57:05 +0300
  • 3a111abd63 minor : updates after rebase to latest master Georgi Gerganov 2023-04-14 21:38:24 +0300
  • 312a927f0b ggml : fix quantize_row_q8_0() ARM_NEON rounding Georgi Gerganov 2023-04-14 21:27:55 +0300
  • 2c4f9b658d Q8: use int8_t, AVX/AVX2 optimizations Stephan Walter 2023-04-13 18:55:55 +0200
  • 19e7a6575d quantize-stats : fix test + add it to Makefile default Georgi Gerganov 2023-04-13 23:33:35 +0300
  • 3b894ec657 ggml : add Q8_0 quantization for intermediate results Georgi Gerganov 2023-04-13 23:03:27 +0300
  • aa485cee33 ggml : use posix_memalign on non-Windows env master-aa485ce Georgi Gerganov 2023-04-15 14:25:45 +0300
  • 8fbfc80e03 Fix clblast device selection on Linux 0cc4m 2023-04-15 12:02:36 +0200
  • c12b14b77f benchmark : fix result validation in benchmark-q4_0-matmult (#987) master-c12b14b Ivan Komarov 2023-04-15 07:51:54 +0200
  • 106faaf297 cmake : add finding the OpenBLAS header file (#992) master-106faaf katsu560 2023-04-15 14:51:11 +0900
  • d00b865eb1 Merge branch 'master' into concedo Concedo 2023-04-15 11:33:43 +0800
  • 9c6118c3fc convert.py: Fix loading safetensors and ggml format on Windows comex 2023-04-14 18:40:57 -0700
  • 185dc24a19 add finding the OpenBLAS header file katsu560 2023-04-15 10:59:20 +0900
  • c90261449f fix conflict wbpxre150 2023-04-15 09:19:46 +0800
  • d071650c2c Merge branch 'master' into test wbpxre150 2023-04-15 08:50:11 +0800
  • b148cd1eba Fix potential int8 overflow in non-SIMD vec_dot Stephan Walter 2023-04-14 22:31:38 +0200
  • 0ef5704ea7 Fix result validation in benchmark-q4_0-matmult Ivan Komarov 2023-04-14 22:19:15 +0200
  • c85e03d12e Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" (#982) master-c85e03d Pavol Rusnak 2023-04-14 21:58:43 +0200
  • 489093548c py : bump sentencepiece to 0.1.98 to support Python 3.11 (#976) Pavol Rusnak 2023-04-14 21:46:49 +0200
  • 93265e988a make : fix dependencies, use auto variables (#983) master-93265e9 Stephan Walter 2023-04-14 19:39:48 +0000
  • dcf397c313 Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" Pavol Rusnak 2023-04-14 21:23:03 +0200
  • 814d411ea1 Makefile: fix dependencies, use auto variables Stephan Walter 2023-04-14 21:20:04 +0200
  • 4d0f761a4c nix: use convert.py instead of legacy wrapper convert-pth-to-ggml.py Pavol Rusnak 2023-04-14 21:17:35 +0200
  • 218201acfd py : bump sentencepiece to 0.1.98 to support Python 3.11 Pavol Rusnak 2023-04-14 20:42:47 +0200
  • 59fb9e9eb8 ggml : fix quantize_row_q8_0() ARM_NEON rounding Georgi Gerganov 2023-04-14 21:27:55 +0300
  • c56b715269 Expose type name from ggml (#970) master-c56b715 Pavol Rusnak 2023-04-14 20:05:37 +0200
  • 327940beae add command line mode wbpxre150 2023-04-15 02:05:04 +0800
  • 801aab14aa Q8: use int8_t, AVX/AVX2 optimizations Stephan Walter 2023-04-13 18:55:55 +0200
  • 949dec0e42 Merge branch 'master' into wbpxre150 wbpxre150 2023-04-15 01:20:25 +0800
  • ea5d01002f Merge branch 'concedo' of https://github.com/LostRuins/llamacpp-for-kobold into concedo Concedo 2023-04-15 01:14:10 +0800
  • 8dc06c7ab3 Fixed compile error in OSX Concedo 2023-04-15 01:13:56 +0800
  • 624dc8809e Added openblas and clblas package names for debian (#63) AlpinDale 2023-04-14 21:38:56 +0430
  • 58b91f1011 Expose type name from ggml Håkon H. Hitland 2023-04-14 00:33:30 +0200
  • c3b810868d fixed an offset bug? Concedo 2023-04-15 00:30:00 +0800
  • f4d277ae17 main : alternative instruct mode (Vicuna support, etc.) (#863) master-f4d277a Tomáš Pazdiora 2023-04-14 17:19:17 +0200
  • 1b1c0730f5 Idk why people keep thinking its an error lol. Concedo 2023-04-14 22:58:45 +0800
  • 1003c971ad update embedded kobold lite Concedo 2023-04-14 22:54:16 +0800
  • c9a59b70a5 ggml : add unary and binary map operations (#874) master-c9a59b7 Kerfuffle 2023-04-14 08:43:55 -0600
  • 932d981222 more make targets Concedo 2023-04-14 21:54:18 +0800
  • a819f22cac Merge branch 'master' into concedo Concedo 2023-04-14 21:40:33 +0800
  • a32f7acc9f py : cleanup dependencies (#962) Pavol Rusnak 2023-04-14 15:37:11 +0200
  • 8ad42a1102 read from inputs Concedo 2023-04-14 21:30:26 +0800
  • adb4df78d6 Added SmartContext mode, a way of prompt context manipulation that avoids frequent context recalculation. Concedo 2023-04-14 21:24:16 +0800
  • 995fe0303e py : cleanup dependencies Pavol Rusnak 2023-04-14 10:35:00 +0200
  • 43ffdefb74 py : fix flake8 and isort nitpicks (#960) Pavol Rusnak 2023-04-14 14:23:21 +0200
  • 3bbcbe441f Merge remote-tracking branch 'origin/master' into cli-ui-update Tomáš Pazdiora 2023-04-14 14:00:12 +0200
  • 1623a6e9b4 ggml : minor master-1623a6e Georgi Gerganov 2023-04-14 13:31:29 +0300
  • c14e0d2f23 ggml : always allocate buffers with size multiple of GGML_MEM_ALIGN Georgi Gerganov 2023-04-14 13:31:15 +0300
  • 7d03e6e417 Fix position of map ops cases in ggml_compute_forward KerfuffleV2 2023-04-14 04:01:50 -0600
  • 7d695973a5 Various cleanups. KerfuffleV2 2023-04-14 03:55:46 -0600
  • 1c73d4eec7 GGML map ops proof of concept. KerfuffleV2 2023-04-10 07:10:01 -0600
  • dea05626de py : fix flake8 and isort nitpicks Pavol Rusnak 2023-04-14 10:28:04 +0200
  • 723dac55fa py : new conversion script (#545) comex 2023-04-14 00:03:03 -0700
  • 0f07cacb05 ggml : fix q4_1 dot product types master-0f07cac Georgi Gerganov 2023-04-14 09:45:42 +0300
  • c5d70f5c9e ggml : optimize rope function to avoid call powf in the tight loop (#807) master-c5d70f5 Howard Su 2023-04-14 14:24:52 +0800
  • acb404ce4b Merge pull request #3 from wbpxre150/master wbpxre150 2023-04-14 12:52:21 +0800
  • 030306d36a Merge pull request #2 from wbpxre150/wbpxre150 wbpxre150 2023-04-14 12:50:29 +0800
  • 18cf46c781 Merge pull request #1 from ggerganov/master wbpxre150 2023-04-14 12:48:36 +0800
  • 10e73b08be fix conflict wbpxre150 2023-04-14 12:41:33 +0800
  • e6468f95c1 whitespace wbpxre150 2023-04-14 12:38:56 +0800
  • aa6bca453f Fix prints. wbpxre150 2023-04-14 12:36:20 +0800
  • 241065eccd New conversion script (#545) comex 2023-04-13 21:06:30 -0700
  • e524ce99fe add macos headers jon-chuang 2023-04-14 10:22:52 +0800
  • be87b6ed20 perplexity : add support for batch size to --perplexity (#407) master-be87b6e Gary Linscott 2023-04-13 14:50:42 -0700
  • 3f93a00d9d quantize-stats : fix test + add it to Makefile default Georgi Gerganov 2023-04-13 23:33:35 +0300
  • a520b33b3a ggml : add Q8_0 quantization for intermediate results Georgi Gerganov 2023-04-13 23:03:27 +0300
  • 3fa8837068 improve jon-chuang 2023-04-14 04:00:41 +0800
  • 02b0fe86f2 improve jon-chuang 2023-04-14 03:55:33 +0800
  • b17d54eda3 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into jon/use-hardware-cores jon-chuang 2023-04-14 03:07:49 +0800
  • e0325353be apply code review jon-chuang 2023-04-14 03:07:45 +0800
  • 315e69fd7f fix code indentation. wbpxre150 2023-04-14 01:58:38 +0800
  • fa651909bb refector code into function. wbpxre150 2023-04-14 01:24:32 +0800
  • 0e07e6a839 common : remove unnecessary includes (#947) master-0e07e6a CRD716 2023-04-13 10:39:25 -0500
  • a3a2a0eda8 ggml : add GGML_DEFAULT_N_THREADS master-a3a2a0e Georgi Gerganov 2023-04-13 18:36:40 +0300
  • d990e3fffc ggml : speed-up ggml_vec_dot_q4_1() ARM_NEON + 32-bit ARM support (#900) master-d990e3f Georgi Gerganov 2023-04-13 18:32:36 +0300
  • 99e7c9b9e6 ggml : try to use correct ifdef Georgi Gerganov 2023-04-13 18:31:15 +0300
  • 63f7ecf47c ggml : fix comment Georgi Gerganov 2023-04-13 18:28:18 +0300
  • 1b59a07380 ggml : implement vzip when missing Georgi Gerganov 2023-04-13 18:26:44 +0300
  • 23fd782d35 Update batch size for efficiency Gary Linscott 2023-04-13 08:20:54 -0700
  • be21d538e6 ggml : implement vminvq and vmaxvq when missing Georgi Gerganov 2023-04-13 18:19:20 +0300
  • 14a0b207bc ggml : implement vaddvq when missing Georgi Gerganov 2023-04-13 18:16:35 +0300
  • fbcecd59a9 Merge remote-tracking branch 'origin/master' into batch_perplexity Gary Linscott 2023-04-13 08:13:09 -0700