Commit Graph

  • 4e1f81d32f implement backward pass for ggml_get_rows and for new operation ggml_get_rows_back xaedes 2023-04-24 22:49:34 +0200
  • 488decfdc5 implement backward pass of ggml_rope and ggml_rope_back xaedes 2023-04-24 19:06:16 +0200
  • 36d8a051d4 remove already resolved TODO xaedes 2023-04-24 05:54:51 +0200
  • b908007471 norm & rms_norm can not be threaded: xaedes 2023-04-24 04:13:33 +0200
  • b164343529 implement 5 of 6 missing backward pass operations used by llama xaedes 2023-05-01 02:20:14 +0200
  • 73ac18d856 implement 8 of 14 missing backward pass operations used by llama xaedes 2023-05-01 02:39:54 +0200
  • c9fdebc02c Hotfix prompt caching introduced in #1169, fixes #1257 Ivan Stepanov 2023-05-01 03:34:34 +0300
  • 635327c355 Bump version Ivan Stepanov 2023-04-30 23:33:18 +0300
  • dd88594585 Save prompt after initial prompt eval (fixes #1257) Ivan Stepanov 2023-04-30 23:16:41 +0300
  • 7ff0dcd320 ggml : fix UB (int << 31) master-7ff0dcd Georgi Gerganov 2023-04-30 22:28:51 +0300
  • 6f79699286 build: add armv{6,7,8} support to cmake (#1251) master-6f79699 Pavol Rusnak 2023-04-30 20:48:38 +0200
  • a5d30b1f53 common : better default number of threads (#934) master-a5d30b1 jon-chuang 2023-04-30 14:41:35 -0400
  • 76a884920a ggml : add CLBlast q5_0, q5_1, q8_0 dequant kernels (#1225) master-76a8849 0cc4m 2023-04-30 20:34:52 +0200
  • 6bc4400e67 ggml : add Q5 WASM SIMD + GGML_FTYPE master-6bc4400 Georgi Gerganov 2023-04-30 19:07:00 +0300
  • 25201233ca fixed unbantokens not following EOS Concedo 2023-05-01 00:02:45 +0800
  • 294a5d00b1 Merge remote-tracking branch 'occam/clblast-further-dequant-kernels' into concedo_experimental Concedo 2023-04-30 23:56:24 +0800
  • 3b5df18dbb temp fix for compilation issues on OSX (M1) Concedo 2023-04-30 23:48:46 +0800
  • c73def129a Merge 'origin/master' into hipblas Henri Vasserman 2023-04-30 18:40:42 +0300
  • 979010cdba minor jon-chuang 2023-04-30 21:02:55 +0800
  • e112522aa9 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into jon/tall-and-skinny-matmul jon-chuang 2023-04-30 20:57:32 +0800
  • 470cc4c5d1 minor jon-chuang 2023-04-30 20:56:46 +0800
  • f0d70f147d Various fixes to mat_mul benchmark (#1253) master-f0d70f1 Stephan Walter 2023-04-30 12:32:37 +0000
  • 363f72de85 Rename benchmark-q4_0-matmult.cpp -> benchmark-matmult.cpp Stephan Walter 2023-04-30 13:55:42 +0200
  • f5a5cc9e6a build: add armv{6,7,8} support to cmake Pavol Rusnak 2023-04-30 11:13:03 +0200
  • fb469ed972 fma compile only jon-chuang 2023-04-30 18:24:56 +0800
  • 876dcec301 Various fixes to mat_mul benchmark Stephan Walter 2023-04-30 12:19:55 +0200
  • 74a8db7ade Merge branch 'master' of https://github.com/ggerganov/llama.cpp into jon/tall-and-skinny-matmul jon-chuang 2023-04-30 18:19:48 +0800
  • 78761b10b6 minor jon-chuang 2023-04-30 18:15:53 +0800
  • 2fbc90f25e minor jon-chuang 2023-04-30 18:14:41 +0800
  • 496d291d67 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into jon/use-hardware-cores jon-chuang 2023-04-30 18:13:49 +0800
  • f1c19d8884 remove jon-chuang 2023-04-30 18:11:52 +0800
  • 710c4bbdbf Apply suggestions from code review jon-chuang 2023-04-30 18:10:08 +0800
  • e69c924ad1 Use two memcpy calls for q5_0 buffer transfer 0cc4m 2023-04-30 10:44:48 +0200
  • fdd21d0eba add missing include Concedo 2023-04-30 16:15:11 +0800
  • 3e5aa8a1c4 ggml : fix labels for GGML_OP_ALIBI master-3e5aa8a Georgi Gerganov 2023-04-30 10:25:46 +0300
  • b3315459c7 pulled the new dequants for clblast, fixed some ooms Concedo 2023-04-30 14:15:44 +0800
  • bd5e7409f3 Fixed incorrect example of quantize in README.md D3faIt 2023-04-30 06:43:46 +0200
  • 0061b90ec6 Merge branch 'master' into concedo_experimental Concedo 2023-04-30 10:35:02 +0800
  • 8e739a091f Compress llama state Ivan Stepanov 2023-04-30 03:16:46 +0300
  • 476f46f7cc cuBLAS: do not use pinned memory if env variable GGML_CUDA_NO_PINNED is set Slaren 2023-04-29 22:25:00 +0200
  • c3ca7a5f05 ggml : fix 32-bit ARM NEON master-c3ca7a5 Georgi Gerganov 2023-04-29 21:34:23 +0300
  • e859ebbb48 ggml: use __restrict instead of restrict on MS compiler to prevent compiler error on VS2017 and VS2019. Helmut 2023-04-29 20:31:08 +0200
  • e8c051611a ggml : use vzip instead of vuzp for consistency master-e8c0516 Georgi Gerganov 2023-04-29 21:12:56 +0300
  • 0b5a935099 ggml : fix visibility and unused warnings master-0b5a935 Georgi Gerganov 2023-04-29 19:28:36 +0300
  • 08e539d5e4 cuBLAS: fall back to pageable memory if pinned alloc fails Slaren 2023-04-29 18:25:46 +0200
  • ec728e44d7 ggml : fix #if for f32_f32 mul_mat (CLBlast) (#1229) master-ec728e4 Georgi Gerganov 2023-04-29 18:43:42 +0300
  • 214b6a3570 ggml : adjust mul_mat_f16 work memory (#1226) master-214b6a3 Georgi Gerganov 2023-04-29 18:43:28 +0300
  • db0604121a Handle signals properly on Windows Danny Daemonic 2023-04-22 04:01:02 -0700
  • f51988952a ggml : fix #if for f32_f32 mul_mat (CLBlast) Georgi Gerganov 2023-04-29 14:45:12 +0300
  • f149114395 up ver Concedo 2023-04-29 19:42:21 +0800
  • 7afad2b9b5 integrated the new samplers Concedo 2023-04-29 19:41:41 +0800
  • 658c686e5a ggml : add asserts to guard for incorrect wsize Georgi Gerganov 2023-04-29 14:26:36 +0300
  • 0ffcd89870 ggml : reduce memory buffer for F16 mul_mat when not using cuBLAS Georgi Gerganov 2023-04-29 12:41:54 +0300
  • 150e135858 llama : minor - remove explicit int64_t cast Georgi Gerganov 2023-04-29 11:45:23 +0300
  • 305eb5afd5 build : fix reference to old llama_util.h master-305eb5a Georgi Gerganov 2023-04-29 13:53:12 +0300
  • 84ca9c2ecf examples : fix save-load-state + rename llama-util.h Georgi Gerganov 2023-04-29 13:48:11 +0300
  • da0c34b028 Merge branch 'master' into concedo_experimental Concedo 2023-04-29 18:27:06 +0800
  • fe0e4de8e8 fixed a regression where a bad model was giving valid logits after library changes. now we run the eval through the model twice and compare logits. if they give the same logits for different inputs, model is broken Concedo 2023-04-29 18:25:17 +0800
  • 369d903eda Move cl kernels into ggml-opencl.c 0cc4m 2023-04-29 10:48:52 +0200
  • d6be497ef6 Fix q8_0 dequant kernel 0cc4m 2023-04-29 10:37:58 +0200
  • 1560c10f24 Work around q5_0 OpenCL issue 0cc4m 2023-04-29 10:26:58 +0200
  • 9439da6f95 Implement q5_0, q5_1 and q8_0 0cc4m 2023-04-29 07:43:15 +0200
  • d8ea75e952 Merge 'origin/master' into hipblas Henri Vasserman 2023-04-29 11:25:51 +0300
  • 334637e43e common : change default parameters to pre-#1126 (#1223) master-334637e Georgi Gerganov 2023-04-29 09:51:06 +0300
  • 17d3938955 common : change default parameters to pre-#1126 Georgi Gerganov 2023-04-29 09:22:34 +0300
  • dd7eff57d8 llama : new sampling algorithms (#1126) master-dd7eff5 Ivan Stepanov 2023-04-29 08:34:41 +0300
  • 5aa185f3f7 remove preallocation Concedo 2023-04-29 12:32:37 +0800
  • bb282a4ecf reinstated the q4_3 format, for backwards compatibility. Concedo 2023-04-29 11:42:04 +0800
  • 0fc1772a8f Merge branch 'master' into concedo_experimental Concedo 2023-04-29 11:14:05 +0800
  • 67ee2b93a7 removed bad import. Concedo 2023-04-29 09:59:16 +0800
  • d67f481144 Speedup dequantize_block_q4_0() Ivan Komarov 2023-04-29 00:46:25 +0300
  • 70ce50d377 Merge 5bac24d2f7 into 7fc50c051a TheNotary 2023-04-29 08:23:11 +0800
  • 7fc50c051a cuBLAS: use host pinned memory and dequantize while copying (#1207) master-7fc50c0 slaren 2023-04-29 02:04:18 +0200
  • 38a021fafe fix rebase Slaren 2023-04-29 01:55:50 +0200
  • 3cf2247d37 cuBLAS: also pin kv cache Slaren 2023-04-28 00:48:01 +0200
  • d5d6a8083a cuBLAS: improve ggml_compute_forward_mul_mat_f16_f32 with pinned memory Slaren 2023-04-27 23:27:59 +0200
  • 2dd6deeb49 cuBLAS: use host pinned memory Slaren 2023-04-27 21:51:43 +0200
  • d3fd04e92e cuBLAS: dequantize simultaneously while copying memory Slaren 2023-04-27 20:16:32 +0200
  • b1ee8f59b4 cuBLAS: non-contiguous tensor support (#1215) master-b1ee8f5 Henri Vasserman 2023-04-29 02:31:56 +0300
  • 36d19a603b Remove Q4_3 which is no better than Q5 (#1218) master-36d19a6 Stephan Walter 2023-04-28 23:10:43 +0000
  • d194586f65 Merge 'origin/master' into hipblas Henri Vasserman 2023-04-28 23:03:52 +0300
  • f571806da7 Windows test fix Ivan Stepanov 2023-04-28 22:12:25 +0300
  • f1ec8b422d Review suggestions: comments for removed enum values Stephan Walter 2023-04-28 20:44:13 +0200
  • 924309a248 Remove Q4_3 which is no better than Q5 Stephan Walter 2023-04-28 19:43:20 +0200
  • 7f15c5c477 readme : update hot topics Georgi Gerganov 2023-04-28 21:32:52 +0300
  • 55390bcaf2 ggml : sync ggml (ggml_alibi) master-55390bc Georgi Gerganov 2023-04-28 20:37:43 +0300
  • 4ab7bb77c0 Windows build fix Ivan Stepanov 2023-04-28 20:42:44 +0300
  • 3bf3a968b6 Tests Ivan Stepanov 2023-04-28 20:36:53 +0300
  • 416f49182a Save and load example adjust Ivan Stepanov 2023-04-28 20:19:17 +0300
  • 6c4c88d54f Use C++11, clarify llama API documentation, rename Mirostat parameters to --mirostat_lr and --mirostat_ent, add temperature sampling for Mirostat, simplify Mirostat sampling API parameters (removed N and *k) Ivan Stepanov 2023-04-28 19:53:24 +0300
  • 61f822f63b Added --logit-bias and --no-penalize-nl, removed std::span Ivan Stepanov 2023-04-28 03:12:49 +0300
  • f01c67fe55 mirostat Ivan Stepanov 2023-04-22 21:23:10 +0300
  • 9b3b07cc5c Sample interface, new samplers. Ivan Stepanov 2023-04-22 14:31:08 +0300
  • 5fba3c016b examples : add Jeopardy example (#1168) CRD716 2023-04-28 11:13:33 -0500
  • d8b4b15722 Merge branch 'ggerganov:master' into master Wen Shi 2023-04-29 00:00:56 +0800
  • 1481a9cf25 llama : add session file format and saved sessions in main (#1169) master-1481a9c Evan Jones 2023-04-28 11:59:37 -0400
  • ef7bfbad54 Merge pull request #1 from shiqinwen/shiqinwen-correct-readme-quantize-param Wen Shi 2023-04-28 23:58:52 +0800
  • ef551af6c1 Correct the parameters of type given. Wen Shi 2023-04-28 23:58:30 +0800
  • f19ee3b2ec now then? Henri Vasserman 2023-04-28 18:50:07 +0300
  • 759510534c more fixes, now OpenBLAS and CLBlast build too Henri Vasserman 2023-04-28 18:31:56 +0300
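
A few of the entries above describe concrete, self-contained techniques; the sketches below illustrate them. Commit e859ebbb48 swaps the C99 `restrict` keyword for MSVC's `__restrict`, since VS2017/VS2019 reject the former. A minimal sketch of that kind of portability shim, assuming a plain C translation unit; the macro name GGML_RESTRICT and the helper function are illustrative, not the actual ggml code:

```c
/* Sketch of the MSVC workaround from commit e859ebbb48: MSVC does not accept
 * the C99 `restrict` keyword but offers __restrict with the same meaning.
 * GGML_RESTRICT and scale_row are illustrative names, not the real ggml code. */
#if defined(_MSC_VER)
#define GGML_RESTRICT __restrict
#else
#define GGML_RESTRICT restrict
#endif

/* the no-aliasing promise lets the compiler vectorize the loop freely */
void scale_row(float * GGML_RESTRICT dst, const float * GGML_RESTRICT src, int n, float s) {
    for (int i = 0; i < n; ++i) {
        dst[i] = s * src[i];
    }
}
```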
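Commits 476f46f7cc and 08e539d5e4 describe the pinned-memory policy for cuBLAS transfers: use page-locked host buffers by default, skip them when the GGML_CUDA_NO_PINNED environment variable is set, and fall back to pageable memory when the pinned allocation fails. A sketch of that allocation logic, assuming the CUDA runtime API; host_buffer_alloc and host_buffer_free are illustrative names, not the actual ggml-cuda functions:

```c
/* Sketch of the policy from commits 476f46f7cc and 08e539d5e4: prefer pinned
 * (page-locked) host memory, honor GGML_CUDA_NO_PINNED, and fall back to
 * pageable memory on failure. Names are illustrative, not the real ggml-cuda API. */
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime_api.h>

void * host_buffer_alloc(size_t size, int * is_pinned) {
    *is_pinned = 0;
    if (getenv("GGML_CUDA_NO_PINNED") == NULL) {
        void * ptr = NULL;
        if (cudaMallocHost(&ptr, size) == cudaSuccess) {
            *is_pinned = 1;
            return ptr; /* page-locked: fastest host<->device transfers */
        }
        fprintf(stderr, "warning: pinned allocation failed, using pageable memory\n");
    }
    return malloc(size); /* pageable fallback: still correct, just slower copies */
}

void host_buffer_free(void * ptr, int is_pinned) {
    if (is_pinned) {
        cudaFreeHost(ptr);
    } else {
        free(ptr);
    }
}
```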
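Commit fe0e4de8e8 adds a sanity check: run the eval twice with different inputs and compare the logits, since a broken model tends to emit the same logits regardless of what it is fed. A minimal sketch of that comparison, assuming the caller has already run two forward passes on different prompts and kept both logits arrays; model_looks_broken is a hypothetical helper, not the actual koboldcpp code:

```c
/* Sketch of the check from commit fe0e4de8e8: a healthy model must produce
 * different logits for different inputs; (near-)identical outputs indicate a
 * broken model. model_looks_broken is a hypothetical name. */
#include <math.h>

int model_looks_broken(const float * logits_a, const float * logits_b, int n_vocab) {
    const float eps = 1e-6f;
    for (int i = 0; i < n_vocab; ++i) {
        if (fabsf(logits_a[i] - logits_b[i]) > eps) {
            return 0; /* outputs differ somewhere: the model reacts to its input */
        }
    }
    return 1; /* identical (within eps) logits for two different prompts */
}
```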