Commit Graph

  • b5c5c8e2b9 Merge branch 'ggerganov:master' into master WangHaoranRobin 2023-06-26 11:56:01 -0700
  • c824d2e368 ggml : avoid conv 2d kernel round up master-c824d2e Georgi Gerganov 2023-06-26 21:03:59 +0300
  • b853d45601 ggml : add NUMA support (#1556) master-b853d45 zrm 2023-06-26 13:57:59 -0400
  • 9aec2b74bd server : utilize numa parameter Georgi Gerganov 2023-06-26 20:53:55 +0300
  • 81a40e9d61 ggml : fix handling of ops with n_threads > n_tasks > 1 Georgi Gerganov 2023-06-26 20:50:50 +0300
  • 4a555b4539 ggml : style / formatting Georgi Gerganov 2023-06-26 20:37:55 +0300
  • 875a1e111e llama : avoid ggml include in llama-util.h Georgi Gerganov 2023-06-26 20:27:24 +0300
  • 0fe4b00de2 llama : allow to initialize backend with NUMA support Georgi Gerganov 2023-06-26 20:24:17 +0300
  • 8f98035e0a Merge branch 'master' into HEAD Georgi Gerganov 2023-06-26 20:17:39 +0300
  • 9225baef71 k-quants : fix indentation master-9225bae Georgi Gerganov 2023-06-26 20:10:52 +0300
  • f41f09a50a add api_like_OAI.py jwj7140 2023-06-27 02:05:46 +0900
  • a84ab1da8d tests : fix quantize perf (#1990) master-a84ab1d katsu560 2023-06-27 01:47:02 +0900
  • 5743ca8092 k-quants : add AVX support to dot functions (#1916) master-5743ca8 katsu560 2023-06-27 01:46:07 +0900
  • 412c60e473 readme : add link to new k-quants for visibility Georgi Gerganov 2023-06-26 19:45:09 +0300
  • 6769e944c7 k-quants : support for super-block size of 64 (#2001) master-6769e94 Kawrakow 2023-06-26 19:43:07 +0300
  • 75031d5c23 Add verbose flag to control console output about model information on load. grahameth 2023-06-26 18:22:39 +0200
  • 42bf841232 Increase GGML_MAX_NAME David Yang 2023-06-26 11:26:48 +0800
  • 3ee817a2cc doc- fix typo gustrd 2023-06-26 12:57:38 -0300
  • 46cb52bfa4 Merge branch 'master' into master-androidClblast gustrd 2023-06-26 12:55:27 -0300
  • de2ddd86f3 doc - LD_LIBRARY_PATH complement for some Android devices when building with CLBlast inside Termux. gustrd 2023-06-26 12:53:22 -0300
  • cbebf61ca7 Fix assert when free invalid cuda pointer (#2005) master-cbebf61 Howard Su 2023-06-26 23:15:47 +0800
  • ef15653a3b Fix assert when free invalid cuda pointer Howard Su 2023-06-26 22:42:35 +0800
  • 5fd83379ff k_quants: fixed issue caused by merging with master Iwan Kawrakow 2023-06-26 13:46:24 +0300
  • 53e81ca289 k_quants: 10% faster ARM_NEON Q5_K dot product Iwan Kawrakow 2023-06-24 18:32:38 +0300
  • 2da3a59708 k_quants: AVX2 implementation for new 64-weight Q5_K Iwan Kawrakow 2023-06-24 18:17:35 +0300
  • ccf4901334 k_quants: change Q5_K to be type 0 when QK_K = 64 Iwan Kawrakow 2023-06-24 17:39:25 +0300
  • 4f61506929 k_quants: forgot to add the Metal changes in last commit Iwan Kawrakow 2023-06-24 15:47:27 +0300
  • ce19b965f0 k_quants: switch Q4_K to 4-bit scales when QK_K = 64 Iwan Kawrakow 2023-06-24 15:44:23 +0300
  • aeefd4e781 k_quants: switch Q3_K to 4-bit scales when QK_K = 64 Iwan Kawrakow 2023-06-24 09:57:50 +0300
  • 88412a1aa0 Simplify via lambda Iwan Kawrakow 2023-06-23 17:50:53 +0300
  • 333ffcc5ba Fixed bug in q4_K quantization added with the 64-block addition Iwan Kawrakow 2023-06-23 17:46:11 +0300
  • 558a19427b k_quants: correctly define QK_K in llama.cpp Iwan Kawrakow 2023-06-23 14:35:21 +0300
  • 8b98d01e31 k_quants: call them _K, not _k, also on Metal Iwan Kawrakow 2023-06-23 14:16:12 +0300
  • 285eeb1531 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-23 14:11:47 +0300
  • ff83e32c6a k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-23 13:34:02 +0300
  • 6081a65527 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-23 13:04:30 +0300
  • 167a0bbe34 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-23 12:34:07 +0300
  • e1bbcfc5cb k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-23 11:32:14 +0300
  • fae24afd01 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-23 10:11:35 +0300
  • d92c5a9e29 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-23 08:43:04 +0300
  • 2ff543c147 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 23:15:41 +0300
  • 9d27d8d0ea k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 20:04:18 +0300
  • 2b2a13c4f9 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 19:41:28 +0300
  • 80c75fe821 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 18:16:08 +0300
  • cda47a6b2f k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 16:31:51 +0300
  • 03f30c8eca k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 15:10:41 +0300
  • 3bd9ae79d8 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 13:31:54 +0300
  • 460dd841b1 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 12:51:18 +0300
  • 41e46ec1c2 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 11:39:00 +0300
  • 5aae4b8d4f k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 09:40:33 +0300
  • c6c35366bf k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-22 00:39:21 +0300
  • bcf8c5c384 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-21 18:28:40 +0300
  • 2b2ab31a89 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-21 16:53:06 +0300
  • aebd5471e9 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-21 15:20:18 +0300
  • 1f6195c2f2 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-21 14:31:21 +0300
  • 9fe2a2b1db k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-21 13:41:30 +0300
  • d2f12ac354 k_quants: WIP super-blocks with 64 weights Iwan Kawrakow 2023-06-21 12:43:44 +0300
  • 1fdf9d1131 desc Concedo 2023-06-26 16:58:59 +0800
  • e4c9aea840 Merge branch 'master' into concedo_experimental Concedo 2023-06-26 10:35:47 +0800
  • 77edee7d9b Merge pull request #7 from WangHaoranRobin/robin_fork_master WangHaoranRobin 2023-06-25 16:33:38 -0700
  • 13f5d697ce Merge branch 'master' into robin_fork_master WangHaoranRobin 2023-06-25 16:33:31 -0700
  • c9e6642cf7 server: handle probs output when temp=0; handle final response probs output Wang Haoran(Robin) 2023-06-25 16:29:34 -0700
  • bd6550bd8b Merge pull request #6 from WangHaoranRobin/robin_fork_master WangHaoranRobin 2023-06-25 14:16:35 -0700
  • e815b69579 server: remove n_probs upper limit of 5 Wang Haoran(Robin) 2023-06-25 14:15:14 -0700
  • c1e5c8345e Merge 'origin/master' into hipblas Henri Vasserman 2023-06-25 21:40:05 +0300
  • 0766ee3de4 Clean up compiler warnings in train-text David Yang 2023-06-25 11:44:37 +0800
  • af058cf820 Merge branch 'ggerganov:master' into master WangHaoranRobin 2023-06-25 08:51:59 -0700
  • 40340d82af Add MiniGPT-4 example ningshanwutuobang 2023-06-25 23:42:26 +0800
  • 78fafcaf10 ggml : do not use _GNU_SOURCE gratuitously avoid-gnu-source Georgi Gerganov 2023-06-25 16:41:53 +0300
  • 447ccbe8c3 readme : add new roadmap + manifesto Georgi Gerganov 2023-06-25 16:08:12 +0300
  • bd34cdde38 ggml : sync latest ggml (custom operators) master-bd34cdd Georgi Gerganov 2023-06-25 14:25:08 +0300
  • d2034ced7b Merge branch 'master' into concedo_experimental Concedo 2023-06-25 17:01:15 +0800
  • c2a08f87b8 fix server sampling: top k sampler first (#1977) master-c2a08f8 anon998 2023-06-25 08:48:36 +0000
  • 35a603161a Merge 'origin/master' into hipblas Henri Vasserman 2023-06-25 10:57:48 +0300
  • 28f8f59191 avoid the global state katsu560 2023-06-25 16:55:40 +0900
  • 9eacc38d0d fix test quantize perf katsu560 2023-06-25 16:17:27 +0900
  • 3a88e8e803 fix test quantize perf katsu560 2023-06-25 15:59:39 +0900
  • 3d30586c31 k_quants : apply review comments katsu560 2023-06-25 15:27:17 +0900
  • 66a2555ba6 readme : add Azure CI discussion link Georgi Gerganov 2023-06-25 09:07:03 +0300
  • e65ca7e14a zig : upgrade build system support (#1981) sjinzh 2023-06-25 13:45:44 +0800
  • ed6c9111a2 zig : add new line at the end of the file Georgi Gerganov 2023-06-25 08:44:47 +0300
  • afee3cfc1f draft for #1776 making bos and eos available for user input instead of hard coded Hashem Alsaket 2023-06-24 18:50:08 -0500
  • 6cb62ca287 add cmake build for embd-input ningshanwutuobang 2023-06-25 06:17:30 +0800
  • beca5a6185 add cmake build for embd-input ningshanwutuobang 2023-06-25 06:15:13 +0800
  • 9b03f85953 Merge branch 'master' into embd_inp ningshanwutuobang 2023-06-25 05:52:34 +0800
  • afa8dd51e4 CUDA GPU acceleration for LoRAs + f16 models JohannesGaessler 2023-06-22 21:28:36 +0200
  • 5ec8dd5a3c #1869 Fix null reference errors when training from scratch with CUDA (#1907) master-5ec8dd5 Robyn 2023-06-25 04:10:29 +1000
  • 9d866118c4 refactor the interface and fixed the styles ningshanwutuobang 2023-06-25 01:21:54 +0800
  • 65bdd52a86 tests : sync test-grad0 from ggml master-65bdd52 Georgi Gerganov 2023-06-24 19:40:18 +0300
  • 23b516b053 Merge branch 'ggerganov:master' into master WangHaoranRobin 2023-06-24 07:55:23 -0700
  • 7f7046ea01 Merge pull request #5 from WangHaoranRobin/robin_fork_master WangHaoranRobin 2023-06-24 07:55:15 -0700
  • 02c96a4cbb server: remove trailing white space Wang Haoran(Robin) 2023-06-24 07:54:26 -0700
  • fdd1860911 flake : fix ggml-metal.metal path and run nixfmt (#1974) Rowan Hart 2023-06-24 04:07:08 -0700
  • c943d823c1 convert : fix invalid params in write_vocab_only (#1975) AN Long 2023-06-24 19:02:06 +0800
  • f2c754e1c3 ggml : improve ggml_graph_dump_dot, add ggml_format_name (#1978) master-f2c754e slaren 2023-06-24 12:57:18 +0200
  • 11da1a85cd readme : fix whitespaces Georgi Gerganov 2023-06-24 13:38:18 +0300
  • 235b610d65 readme : fixed termux instructions (#1973) Alberto 2023-06-24 12:32:13 +0200
  • b061ba9e2a llama : fix top-p sampling to match the canonical definition (#1953) master-b061ba9 Alex Renda 2023-06-24 03:15:01 -0700
  • 527b6fba1d llama : make model stateless and context stateful (llama_state) (#1797) master-527b6fb Didzis Gosko 2023-06-24 11:47:58 +0300
  • c2ccd541e9 ggml : do not dereference src0 if NULL Georgi Gerganov 2023-06-24 11:36:10 +0300