Commit Graph

  • 3839704062
    convert-new.py : minor fixes Georgi Gerganov 2023-08-17 17:16:26 +0300
  • 5d044403d3
    Merge branch 'gguf' into gguf-convert Georgi Gerganov 2023-08-17 17:04:49 +0300
  • 39362f3485
    gguf.py : pick some of the refactoring from #2644 Georgi Gerganov 2023-08-17 17:02:01 +0300
  • 5f97a48fc1
    gguf : single pass for writing tensors + refactoring writer M. Yusuf Sarıgöz 2023-08-17 16:57:50 +0300
  • 673ae1a17e
    convert-new.py : convert script now works Georgi Gerganov 2023-08-17 16:52:25 +0300
  • dce07c3121
    gguf : single pass for writing tensors + refactoring writer M. Yusuf Sarıgöz 2023-08-17 16:48:49 +0300
  • 8dae7ce684
    Add --cfg-negative-prompt-file option for examples (#2591) master-8dae7ce Kerfuffle 2023-08-17 07:29:44 -0600
  • d6fd53afd6
    llama.cpp : use ggml_elements() klosax 2023-08-17 15:24:35 +0200
  • 5a0a2c5685
    llama.cpp : print actual model size klosax 2023-08-17 15:18:16 +0200
  • af4960a5a5
    server : attempt to use valid xxd command on linux Jhen 2023-08-17 20:30:48 +0800
  • 7eaa315631
    convert-new.py : add map for skipping tensor serialization Georgi Gerganov 2023-08-17 15:40:39 +0300
  • f31e9230ad
    gguf : single pass for writing tensors + refactoring writer M. Yusuf Sarıgöz 2023-08-17 15:19:30 +0300
  • 580e02e11e
    server : always regenerate asset hpp before compile Jhen 2023-08-17 19:40:27 +0800
  • 86bc9d2750
    convert-new.py : tensor name mapping Georgi Gerganov 2023-08-17 13:15:17 +0300
  • 6c26109743
    Minor formatting change. KerfuffleV2 2023-08-17 04:48:27 -0600
  • e970845383
    tests : add new ggml-vocab-llama.gguf Georgi Gerganov 2023-08-17 12:38:34 +0300
  • 7b6ae89041
    llama : fix tokenizer to use llama_char_to_byte Georgi Gerganov 2023-08-17 12:27:26 +0300
  • 0ba5d488e5
    convert-new.py : vocab-only option should work now Georgi Gerganov 2023-08-17 12:00:13 +0300
  • f9db574bbf
    convert-new.py : minor fixes Georgi Gerganov 2023-08-16 23:11:21 +0300
  • a73ccf1aa3
    llama : replace (permute + reshape + view_1d) with (view_3d) (#2538) master-a73ccf1 Georgi Gerganov 2023-08-17 10:47:09 +0300
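
    Note: this change collapses a permute + reshape + view_1d chain into a single strided 3D view over the KV cache. A minimal sketch of the pattern, with illustrative names (kv_v, n_past, N, ...) rather than the commit's exact expressions:

      #include "ggml.h"

      // Express the V slice as one strided 3D view instead of materializing
      // intermediate permute/reshape nodes in the compute graph.
      static struct ggml_tensor * view_v(struct ggml_context * ctx, struct ggml_tensor * kv_v,
                                         int n_past, int N, int n_embd, int n_head, int n_ctx) {
          const size_t es = ggml_element_size(kv_v);
          return ggml_view_3d(ctx, kv_v,
                  n_past + N,               // ne0: tokens in context
                  n_embd/n_head,            // ne1: per-head dimension
                  n_head,                   // ne2: number of heads
                  n_ctx*es,                 // nb1: byte stride between rows
                  n_ctx*es*(n_embd/n_head), // nb2: byte stride between heads
                  0);                       // byte offset into kv_v
      }
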
  • ccfe9080cd
    llama : remove commented code Georgi Gerganov 2023-08-17 10:45:21 +0300
  • 7cf54e1f74
    tests : adds simple llama grammar tests (#2618) master-7cf54e1 drbh 2023-08-17 03:41:01 -0400
  • a872a2b28e
    ggml-alloc : fix discrepancy between measure & eval (#2639) master-a872a2b Shouzheng Liu 2023-08-17 03:35:53 -0400
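
    Note: this fix (and lshzh-ww's original further down) is about the allocator's two passes staying in sync. A sketch of the measure-then-eval pattern, assuming the ggml_allocr API of this period; the graph builder is a placeholder:

      #include "ggml.h"
      #include "ggml-alloc.h"
      #include <cstdint>
      #include <vector>

      // Pass 1 measures how much memory a graph needs without touching real
      // memory; pass 2 replays the same allocations inside an actual buffer.
      // Any divergence between the two passes corrupts tensor addresses.
      // build() runs once per pass because measuring writes placeholder
      // addresses into the tensors it visits.
      template <typename F>
      static std::vector<uint8_t> plan_and_alloc(F build) {
          const size_t align = 32; // illustrative alignment

          struct ggml_allocr * measure = ggml_allocr_new_measure(align);
          const size_t size = ggml_allocr_alloc_graph(measure, build()); // dry run
          ggml_allocr_free(measure);

          std::vector<uint8_t> buf(size);
          struct ggml_allocr * alloc = ggml_allocr_new(buf.data(), buf.size(), align);
          ggml_allocr_alloc_graph(alloc, build()); // same order, real addresses
          ggml_allocr_free(alloc);
          return buf;
      }
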
  • 42f8fe1927
    examples/gguf : no need to keep q option for quantization any more M. Yusuf Sarıgöz 2023-08-17 08:56:42 +0300
  • d864596e0a
    Merge branch 'gguf' of https://github.com/ggerganov/llama.cpp into gguf goerch 2023-08-17 04:55:26 +0200
  • 3bade857c7
    cleanup grammar at end of test drbh 2023-08-16 21:48:23 -0400
  • 1108394acd
    ggml-alloc: fix discrepancy between measure & eval lshzh-ww 2023-08-16 21:24:58 -0400
  • 9c3866099b
    cleanup slaren 2023-08-17 03:22:19 +0200
  • 94218e8ade
    markdown: add build id slaren 2023-08-17 03:12:03 +0200
  • 569dc6f3d0
    markdown: also show values that differ from the default slaren 2023-08-17 03:06:07 +0200
  • 9e05cc1d69
    avoid dangling pointers in candidate cleanup drbh 2023-08-16 20:58:37 -0400
  • cac70312e3
    add basic cpu and gpu info (linux/cuda only) slaren 2023-08-17 02:50:04 +0200
  • d49dc3d628
    0-terminate code_points drbh 2023-08-16 20:30:26 -0400
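
    Note: the 0-terminate fix matters because the grammar code walks code-point sequences through raw pointers until it reads a 0. A hedged illustration; the function name and ASCII-only decoding are mine, not the test's:

      #include <cstdint>
      #include <vector>

      // A std::vector<uint32_t> is not implicitly 0-terminated; a consumer that
      // advances a const uint32_t * until it reads 0 runs off the end unless
      // the terminator is appended explicitly.
      static std::vector<uint32_t> to_code_points(const char * ascii) {
          std::vector<uint32_t> cps;
          for (; *ascii; ++ascii) {
              cps.push_back((uint8_t) *ascii);
          }
          cps.push_back(0); // the fix: explicit terminator for pointer-based readers
          return cps;
      }
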
  • 67362d9db0
    add sql output slaren 2023-08-17 02:19:52 +0200
  • 314a6b5422
    fix json formatting slaren 2023-08-17 00:12:40 +0200
  • 89a70f78e7
    llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call slaren 2023-08-16 22:40:53 +0200
  • 714fec06ee
    use ggml_add_cast in finetuning xaedes 2023-08-16 23:53:12 +0200
  • 9198b24e4e
    add ggml_add_cast API function xaedes 2023-08-16 23:50:46 +0200
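
    Note: ggml_add_cast, added here and used by the finetune commit just above, performs an add whose result is produced in a requested type, letting a quantized base weight absorb an F32 delta without a separate dequantize step. A sketch assuming the signature introduced on this branch:

      #include "ggml.h"

      // Add an F32 delta to a (possibly quantized) base tensor, producing the
      // sum directly in F32 rather than dequantize + add + cast.
      static struct ggml_tensor * apply_delta(struct ggml_context * ctx,
                                              struct ggml_tensor * base,    // e.g. Q4_0
                                              struct ggml_tensor * delta) { // F32
          return ggml_add_cast(ctx, base, delta, GGML_TYPE_F32);
      }
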
  • 54113caf0d
    convert autosave invocation to useEffect staviq 2023-08-16 23:45:59 +0200
  • 57af6dd320
    sync accepted #2409 fix from upstream staviq 2023-08-16 23:19:55 +0200
  • c88c2a992a
    probably lld is not required Henri Vasserman 2023-08-16 23:17:52 +0300
  • 0919a0f73d
    cmake : install ggml-meta.metal if LLAMA_METAL (#2449) master-0919a0f Kolen Cheung 2023-08-16 21:09:49 +0100
  • ed53db86c3
    metal : print error of load pipeline state (#2564) Jhen-Jie Hong 2023-08-17 04:09:03 +0800
  • f80e245d7b
    add lora finetune support on quantized base model tensors xaedes 2023-08-16 22:06:20 +0200
  • fc8ef549e5
    metal : enable ggml-alloc (#2627) master-fc8ef54 Shouzheng Liu 2023-08-16 16:08:28 -0400
  • 13a746c6b9
    Merge branch 'master' into metal-memory-alloc Georgi Gerganov 2023-08-16 23:08:03 +0300
  • bf83bff674
    metal : matrix-matrix multiplication kernel (#2615) master-bf83bff Shouzheng Liu 2023-08-16 16:07:04 -0400
  • 83a4ad7986
    remove trailing whitespace xaedes 2023-08-16 22:05:41 +0200
  • 83cb9ed4f5
    implement ggml_compute_forward_out_prod_q_f32 xaedes 2023-08-16 22:00:37 +0200
  • 79ad888768
    remove unused call to nonexistent llama_get_layer_from_model xaedes 2023-08-16 21:56:36 +0200
  • 1151653b15
    replace llama API functions for getting model tensors with one function that gets a model tensor by name xaedes 2023-08-16 21:36:40 +0200
  • c40ec5c403
    llama : add hparams.ctx_train + no longer print ftype Georgi Gerganov 2023-08-16 22:05:23 +0300
  • 8be49fdf9e
    convert-new.py : add gguf key-value pairs Georgi Gerganov 2023-08-16 21:52:06 +0300
  • bbbc0ce717
    makefile rewrite Henri Vasserman 2023-08-16 21:28:54 +0300
  • 250cf83847
    convert-new.py : output gguf (WIP) Georgi Gerganov 2023-08-16 20:44:51 +0300
  • 5765f90f58
    better checks for non-optimized builds slaren 2023-08-16 19:24:54 +0200
  • 5ec18934ad
    convert-new.py : pick #2427 for HF 70B support Georgi Gerganov 2023-08-16 20:16:15 +0300
  • c8ee87f141
    gguf.py : merge all files in gguf.py Georgi Gerganov 2023-08-16 19:55:49 +0300
  • 88b5769487
    gguf : deduplicate (#2629) Georgi Gerganov 2023-08-16 19:25:29 +0300
  • 795ec7070c
    examples : dedup simple Georgi Gerganov 2023-08-16 19:22:58 +0300
  • c290f3eee6
    ggml : assert when using ggml_mul with non-F32 src1 Georgi Gerganov 2023-08-16 19:19:46 +0300
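
    Note: a guard in the spirit of this commit fails fast instead of silently computing garbage; the wrapper below is illustrative, not the commit's code:

      #include "ggml.h"

      // GGML_ASSERT aborts with file and line when the condition fails, which
      // beats an incorrect elementwise product on an unsupported src1 type.
      static struct ggml_tensor * mul_f32_checked(struct ggml_context * ctx,
                                                  struct ggml_tensor * a,
                                                  struct ggml_tensor * b) {
          GGML_ASSERT(b->type == GGML_TYPE_F32); // src1 must be F32 here
          return ggml_mul(ctx, a, b);
      }
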
  • 3de6a9aed2
    reenable LLAMA_CUDA_FORCE_DMMV Henri Vasserman 2023-08-16 18:35:16 +0300
  • 68e79cc134
    Merge 'origin/master' into hipblas Henri Vasserman 2023-08-16 18:25:14 +0300
  • f3e90f27de
    convert-llama-h5-to-gguf.py : support alt ctx param name klosax 2023-08-16 17:10:29 +0200
  • 39a2d15461
    avoid stack overflow resulting from big ggml_cgraph xaedes 2023-08-16 16:42:25 +0200
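
    Note: at this point ggml_cgraph was a fixed-size struct with GGML_MAX_NODES-sized arrays, easily hundreds of kilobytes, so a local variable could exceed the thread's stack. A hedged sketch of the workaround:

      #include "ggml.h"
      #include <cstdlib>

      // Heap-allocate the graph instead of declaring it as a local; calloc
      // also zero-initializes n_nodes/n_leafs, as a stack `= {}` would.
      static struct ggml_cgraph * new_graph_on_heap(void) {
          return (struct ggml_cgraph *) calloc(1, sizeof(struct ggml_cgraph));
      }
      // caller builds into the returned graph and later releases it with free()
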
  • 0ab2507ce5
    fix names of lora tensors xaedes 2023-08-16 16:41:20 +0200
  • 6412e97427
    llama : restore the original load/save session implementation Georgi Gerganov 2023-08-16 17:35:37 +0300
  • 620275361d
    add debug prints for training memory improvements xaedes 2023-08-16 16:23:21 +0200
  • be7e564b11
    bug fixes to make finetune compile xaedes 2023-08-16 16:21:43 +0200
  • 50b1e66200
    remove const model and layer arguments in API functions for accessing model tensors xaedes 2023-08-16 16:21:02 +0200
  • 3e3396e2e5
    remove n_prompt and n_gen from the matrix, use each value separately instead slaren 2023-08-16 15:45:39 +0200
  • 19e9beabb3
    print warning if NDEBUG is not defined slaren 2023-08-16 15:36:56 +0200
  • 28ee0c8583
    first draft for LORA finetune training xaedes 2023-08-16 15:31:04 +0200
  • c0a372fd3d
    add API functions to access remaining model parameters: xaedes 2023-08-16 15:30:31 +0200
  • 5b94b14d5d
    llama : fix strncpy warning + note token_to_str does not write null Georgi Gerganov 2023-08-16 15:28:09 +0300
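
    Note: the strncpy warning is the classic C footgun that when the source is at least as long as the count, the destination is left unterminated; per the commit, token_to_str has the same write-up-to-N-bytes contract. The safe pattern:

      #include <cstring>

      // Copy at most dst_size-1 bytes and always terminate; a plain
      // strncpy(dst, src, dst_size) can leave dst without a trailing '\0'.
      static void copy_text(char * dst, size_t dst_size, const char * src) {
          strncpy(dst, src, dst_size - 1);
          dst[dst_size - 1] = '\0';
      }
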
  • a49931300a
    llama.cpp : fix line feed and compiler warning klosax 2023-08-16 14:43:48 +0200
  • dd6eaa32e4
    ggml : fix warnings about unused results Georgi Gerganov 2023-08-16 15:04:13 +0300
  • 1891c928a4
    dedup : CPU + Metal is working Georgi Gerganov 2023-08-16 14:56:51 +0300
  • d72a23e2f1
    gguf : better type names Georgi Gerganov 2023-08-16 14:37:07 +0300
  • 758ff1bbb5
    llama : refactor model loading code (#2620) Georgi Gerganov 2023-08-16 14:34:03 +0300
  • 6823899f2d
    llama : switch print order of meta data Georgi Gerganov 2023-08-16 14:32:59 +0300
  • e524750a6c
    llama : improve printing + log meta data Georgi Gerganov 2023-08-16 14:24:04 +0300
  • f634b292c9
    llama : throw error on missing KV pairs in model meta data Georgi Gerganov 2023-08-16 13:44:35 +0300
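
    Note: failing loudly on a missing KV pair beats reading defaults from garbage. A hedged sketch against the gguf C API on this branch, where gguf_find_key returns a negative index when the key is absent:

      #include "ggml.h"
      #include <stdexcept>
      #include <string>

      // Resolve a required metadata key or throw with the key name.
      static int require_kv(const struct gguf_context * ctx, const char * key) {
          const int i = gguf_find_key(ctx, key);
          if (i < 0) {
              throw std::runtime_error(std::string("missing KV pair in model meta data: ") + key);
          }
          return i;
      }
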
  • c1fe0aba72
    llama : fix Windows build + fix norm_rms_eps key Georgi Gerganov 2023-08-16 13:09:43 +0300
  • ea5615a03a
    convert-llama-h5-to-gguf.py : clarify the reverse permute klosax 2023-08-16 11:23:15 +0200
  • 31fb56e1d3
    llama : fix shape prints Georgi Gerganov 2023-08-16 11:38:17 +0300
  • 5339b859ec
    llama : refactor llama_model_loader (WIP) Georgi Gerganov 2023-08-16 00:02:25 +0300
  • 075d079a72
    Merge branch 'master' into concedo_experimental Concedo 2023-08-16 10:43:06 +0800
  • 444e781f09
    style-fix Shouzheng Liu 2023-08-15 22:24:24 -0400
  • f9bbc6f281
    add missing include slaren 2023-08-16 03:56:52 +0200
  • 9b9905f9b8
    metal: enable ggml-alloc lshzh-ww 2023-08-15 21:35:38 -0400
  • f2cf01ddd2
    improve markdown formatting slaren 2023-08-16 02:39:15 +0200
  • 52b94f42c8
    add Bessel's correction to stdev calculation slaren 2023-08-16 00:25:59 +0200
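
    Note: Bessel's correction divides the sum of squared deviations by n - 1 rather than n, removing the bias introduced by estimating the mean from the same sample. A self-contained sketch:

      #include <cmath>
      #include <vector>

      // Sample standard deviation of benchmark timings.
      static double stdev(const std::vector<double> & v) {
          const size_t n = v.size();
          if (n < 2) return 0.0;
          double mean = 0.0;
          for (double x : v) mean += x;
          mean /= n;
          double sq = 0.0;
          for (double x : v) sq += (x - mean)*(x - mean);
          return std::sqrt(sq / (n - 1)); // n - 1: Bessel's correction
      }
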
  • 6ab6971242
    add missing include slaren 2023-08-15 23:02:07 +0200
  • 6597d61ad7
    fix msvc build slaren 2023-08-15 22:55:05 +0200
  • 7ec6158eec
    add to examples CMakeLists.txt slaren 2023-08-15 22:50:38 +0200
  • cfc7017b5a
    llama : add benchmark example slaren 2023-08-15 20:53:14 +0200
  • 23248d7d32
    llama : minor simplifications Georgi Gerganov 2023-08-15 22:41:55 +0300
  • f477fb069b
    llama : reorder definitions in .cpp to match .h Georgi Gerganov 2023-08-15 22:29:56 +0300
  • afd135a64c
    llama : merge gguf-util.h in llama.cpp Georgi Gerganov 2023-08-15 22:09:56 +0300