Commit Graph

  • 230018d11c ~7% faster Q5_1 AVX2 code (#1477) Ilya Kurdyukov 2023-05-17 01:36:47 +0700
  • 09d82511d4 define default model path once, sync path with readme (#1366) András Salamon 2023-05-16 16:46:34 +0100
  • ea600071cb Revert "feature : add blis and other BLAS implementation support (#1502)" master-ea60007 Georgi Gerganov 2023-05-20 12:03:48 +0300
  • 07e9ace0f9 feature : add blis and other BLAS implementation support (#1502) master-07e9ace Zenix 2023-05-20 18:02:48 +0900
  • 79f2f73c39 setting-openblas-threads.cpp Akrati Singh 2023-05-20 14:23:48 +0530
  • 46f01a2855 Fix typo in INTEGER Zenix 2023-05-20 17:27:56 +0900
  • 417302b226 Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental Concedo 2023-05-20 16:16:48 +0800
  • bd1aa7212c wip2 Concedo 2023-05-20 16:15:06 +0800
  • d6f6b71478 wip Concedo 2023-05-20 16:08:54 +0800
  • ec2e10c444 llama : add llama_init_backend() API (close #1527) master-ec2e10c Georgi Gerganov 2023-05-20 11:06:11 +0300
  • a0cfed1e30 still merging in process Concedo 2023-05-20 15:58:33 +0800
  • d2c59b8ba4 Fix for mingw (#1462) master-d2c59b8 DannyDaemonic 2023-05-20 00:40:02 -0700
  • 503db28849 llama : fix name shadowing and C4146 (#1526) Maxime 2023-05-20 09:22:37 +0200
  • a1cdd29cd2 ggml : rms_norm in chunks chunks Georgi Gerganov 2023-05-17 19:49:04 +0300
  • 5a317898e8 ggml : process mul mat rows in chunks Georgi Gerganov 2023-05-17 19:46:09 +0300
  • 8a203f9fa1 llama : fix compile warnings in llama_set_state_data() master-8a203f9 Georgi Gerganov 2023-05-20 10:14:31 +0300
  • 4fd3e29297 ggml : fix scalar implementation of Q4_1 dot Georgi Gerganov 2023-05-20 10:13:19 +0300
  • a8958f6b76 merging, do not use Concedo 2023-05-20 15:12:31 +0800
  • fb638fa817 Merge remote-tracking branch 'origin/master' into opencl-dev 0cc4m 2023-05-20 07:55:02 +0200
  • 02914698f0 Update Q4_0, Q4_1 and Q8_0 to use half instead of float 0cc4m 2023-05-20 07:45:56 +0200
  • 0deb730749 This is a new file Akrati Singh 2023-05-20 11:12:17 +0530
  • 285f8f990b Explicitely set CLBlast GEMM type 0cc4m 2023-05-20 07:26:38 +0200
  • 4e86a07e57 wip cleanup before big merge Concedo 2023-05-20 12:48:28 +0800
  • e78a971859 Add llama_init_ggml c api. Yaohui Liu 2023-05-20 11:42:28 +0800
  • 010b2753d9 Merge commit '6986c7835adc13ba3f9d933b95671bb1f3984dc6' into concedo_experimental Concedo 2023-05-20 11:30:51 +0800
  • 1225fab2ec fix f16 format detection in neox Concedo 2023-05-20 11:26:50 +0800
  • 3c6bdad892 Update merge.py FNsi 2023-05-20 10:47:41 +0800
  • f6b7767f3f fix: version detection for BLA_SIZEOF_INTEGER, recover min version of cmake zenix 2023-05-20 10:09:47 +0900
  • ee72eafdb9 feature: allow all BLA_VENDOR to be assigned in cmake arguments. align with whisper.cpp pr 927 zenix 2023-05-19 16:04:30 +0900
  • 0926278434 feature: add blis support zenix 2023-05-17 19:33:57 +0900
  • 360c365eb6 Update merge.py FNsi 2023-05-20 08:27:37 +0800
  • 8573e491a1 Add files via upload FNsi 2023-05-20 08:18:05 +0800
  • 3e9227051f Code style Maxime 2023-05-20 00:41:28 +0200
  • 733b566bac some corrections and added as cmake option FSSRepo 2023-05-19 15:11:14 -0600
  • 6df8e93234 update Q formats Henri Vasserman 2023-05-19 23:52:35 +0300
  • 057c9b7dc8 Merge 'origin/master' into clfixes Henri Vasserman 2023-05-19 23:46:18 +0300
  • 95dc4d7270 Merge 'origin/master' into steering steering Henri Vasserman 2023-05-19 23:19:57 +0300
  • 78b1d8351f Add OpenCL compile options 0cc4m 2023-05-19 21:18:57 +0200
  • 2d5db48371 ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508) master-2d5db48 Georgi Gerganov 2023-05-19 22:17:18 +0300
  • c6d82555a2 readme : update performance table + hot topics Georgi Gerganov 2023-05-19 22:10:05 +0300
  • 26f4337a37 Update llama-util.h Maxime 2023-05-19 20:35:32 +0200
  • 5e53d0c9a7 Update llama-util.h Maxime 2023-05-19 20:35:28 +0200
  • 6986c7835a tests : add missing header master-6986c78 Georgi Gerganov 2023-05-19 21:17:28 +0300
  • 943e6081cc examples : add persistent chat (#1495) Evan Jones 2023-05-19 13:39:51 -0400
  • c7c63264e5 examples : fix whitespace Georgi Gerganov 2023-05-19 20:38:14 +0300
  • 7694b52b9a main : make reverse prompt option act as a stop token in non-interactive mode (#1032) Jason McCartney 2023-05-19 10:24:59 -0700
  • 79e3efb0e9 readme : adds WizardLM to the list of supported models (#1485) David Kennedy 2023-05-19 13:16:30 -0400
  • 4b7e245adf minor : fix compile warnings Georgi Gerganov 2023-05-19 20:14:51 +0300
  • 311bd6d294 Suppress redefinition warning for NOMINMAX on mingw. In my installation, this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45. Tom 7 2023-05-19 12:52:16 -0400
  • e12f3fb884 Fix bug in main.cpp where penalize_nl=false has no effect. It modifies the underlying logits array, but at this point we are already working on the candidates copy. Tom 7 2023-05-19 12:50:07 -0400
  • 08a330a136 add cmake target for baby-llama-text xaedes 2023-05-19 18:41:26 +0200
  • 332003584e sample with non-greedy sampling parameters at the end of training xaedes 2023-05-19 18:41:06 +0200
  • e19ead6e3f print used memory before and after optimization xaedes 2023-05-19 18:40:20 +0200
  • da86a1d736 fix cross entropy loss xaedes 2023-05-19 18:39:38 +0200
  • 09b304d015 remove duplicate include xaedes 2023-05-19 18:36:05 +0200
  • 37f5b76df1 ggml fixes to support backward pass on inplace operations xaedes 2023-05-19 18:35:40 +0200
  • 44d83558bc use different arguments for input and output checkpoint xaedes 2023-05-19 18:34:18 +0200
  • 484f6e9438 llama: initialize f16 tables in quantize c api. Yaohui Liu 2023-05-20 00:32:08 +0800
  • d8b0666429 initialize rng with srand xaedes 2023-05-19 18:29:47 +0200
  • a44349753a ggml : fix AVX dot products Georgi Gerganov 2023-05-19 19:13:09 +0300
  • 8b7132972d cuda : update Q4 and Q8 dequantize kernels Georgi Gerganov 2023-05-19 19:00:44 +0300
  • 3094f64241 llama : bump LLAMA_FILE_VERSION to 3 Georgi Gerganov 2023-05-19 18:53:13 +0300
  • d4d037d995 Fix if macros not using defined when required yamashi 2023-05-19 17:47:55 +0200
  • a4d7e000ff Fix name shadowing and C4146 yamashi 2023-05-19 17:35:22 +0200
  • da3d60f154 turning off Henri Vasserman 2023-05-19 17:24:43 +0300
  • 5c9b45c204 Fix a very noobish C mistake Henri Vasserman 2023-05-19 16:44:32 +0300
  • 25fe1c3815 use inplace functions where possible xaedes 2023-05-19 14:53:21 +0200
  • 35dbc8d799 wrap all CL calls in checks. Henri Vasserman 2023-05-19 13:08:57 +0300
  • 24d5ddf67c fixup! GPU weights not in RAM, direct loading with cuFile JohannesGaessler 2023-05-19 10:13:20 +0200
  • 7df9ab9687 clean up Henri Vasserman 2023-05-19 01:47:26 +0300
  • 7f59af52a9 Steer with inpSA instead of with inpL Laura 2023-05-18 23:47:10 +0200
  • 772e3fbe12 add packed just in case Henri Vasserman 2023-05-19 01:12:57 +0300
  • 1bfe5a9886 fixup! GPU weights not in RAM, direct loading with cuFile JohannesGaessler 2023-05-18 23:57:13 +0200
  • 558c672c93 Merge 'origin/master' into clfixes Henri Vasserman 2023-05-19 00:36:03 +0300
  • 962e2a9cd9 Added another check to find a GPU. Henri Vasserman 2023-05-19 00:35:46 +0300
  • fa1a29f36f GPU weights not in RAM, direct loading with cuFile JohannesGaessler 2023-05-17 16:35:50 +0200
  • 2365a2a970 CUDA kernel for ggml_mul, norms in VRAM JohannesGaessler 2023-05-16 15:20:36 +0200
  • de65783ba2 Broadcasting for ggml_mul JohannesGaessler 2023-05-16 09:26:03 +0200
  • 5ea4339273 make kv_f16 the default for api users (#1517) master-5ea4339 Erik Scholz 2023-05-18 19:31:01 +0200
  • ee9654138a Fixes #1511 lambda issue for w64devkit (mingw) (#1513) master-ee96541 DannyDaemonic 2023-05-18 10:30:40 -0700
  • d71a8394f7 make kv_f16 the default for api users Green Sky 2023-05-18 14:55:08 +0200
  • 9a565fc9a1 Update README.md ajasibley 2023-05-18 12:39:44 +0000
  • 88b988e74b Update README.md ajasibley 2023-05-18 12:38:18 +0000
  • 2dd8d16948 Update README.md ajasibley 2023-05-18 12:38:06 +0000
  • 3c0ab1f8b8 Update README.md ajasibley 2023-05-18 12:37:35 +0000
  • 954dccfd22 Update README.md ajasibley 2023-05-18 12:37:05 +0000
  • 4905d355b5 Updated shell nix to install cosmo if not already installed. Added README and cleaned up repo. Aja Sibley 2023-05-18 12:36:09 +0000
  • 036c3675df Try to salvage the lambda function Danny Daemonic 2023-05-18 05:03:39 -0700
  • 649098737d Fix for w64devkit and mingw Danny Daemonic 2023-05-18 04:39:17 -0700
  • f5e1fe46e1 up ver Concedo 2023-05-18 17:15:03 +0800
  • f65bae760a Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental Concedo 2023-05-18 15:52:35 +0800
  • dae9a14b50 disable CL f16 for now until it's sorted out Concedo 2023-05-18 14:23:56 +0800
  • b73c437e83 Fix convert_row_f16 kernel issue 0cc4m 2023-05-18 08:05:19 +0200
  • 0df55da4ca Deduplicate dequant kernels 0cc4m 2023-05-18 07:35:40 +0200
  • 119bddaa32 authenticity is all you need Barton Rhodes 2023-05-18 03:22:13 +0000
  • 26e842a342 Added error log for cosmo. Aja Sibley 2023-05-18 01:07:10 +0000
  • 985ab154ec Added openssl to nix for cosmonic. Aja Sibley 2023-05-18 00:07:39 +0000
  • dc271c52ed Remove unused n_parts parameter (#1509) master-dc271c5 Stephan Walter 2023-05-17 22:12:01 +0000
  • c3ad27c5df Updated comments. Aja Sibley 2023-05-17 21:19:50 +0000
  • 933bb643a7 Fixed the vespa overlay. Aja Sibley 2023-05-17 21:18:23 +0000