Commit Graph

  • a02b809a2e
    llama : move hparams and vocab from gguf_file_loader to llama_model_loader Georgi Gerganov 2023-08-15 21:09:27 +0300
  • 4a1741aa2d
    gptneox-main.cpp : add tensor data layout klosax 2023-08-15 19:56:19 +0200
  • 2ae0e985b3
    convert-llama-7b-pth-to-gguf.py : add tensor data layout klosax 2023-08-15 19:55:13 +0200
  • 66756c82af
    convert-llama-h5-to-gguf.py : add tensor data layout klosax 2023-08-15 19:54:33 +0200
  • 6c3f824697
    llama : simplify gguf_file_loader Georgi Gerganov 2023-08-15 20:53:53 +0300
  • b6056c3db8
    gguf.py : add tensor data layout klosax 2023-08-15 19:53:44 +0200
  • 2906d5492d
    gguf : remove obsolete gguf_get_arr_xxx API Georgi Gerganov 2023-08-15 20:46:18 +0300
  • 1751bd4693
    gguf : remove obsolete write methods Georgi Gerganov 2023-08-15 20:41:53 +0300
  • f7a6aa9911
    gguf : streaming support when writing files Georgi Gerganov 2023-08-15 19:57:37 +0300
  • a527eccb43
    metal: fix bugs for GQA and perplexity test. lshzh-ww 2023-08-15 11:31:13 -0400
  • 4ef5e792e3
    llama : replace gguf_file_saver with new gguf write API Georgi Gerganov 2023-08-15 16:30:07 +0300
  • 03297c1a7c
    simplify code and allow using the directory containing the file as a valid value, as intended in the first place Marc 2023-08-15 16:35:06 +0200
  • 7e88677af4
    Add support for q4_1, q5_0, q5_1 and q8_0 0cc4m 2023-08-15 15:38:57 +0200
  • 35177d735d
    gguf : minor Georgi Gerganov 2023-08-15 16:05:23 +0300
  • c9b2f7f1bf
    gguf : fixes + simplify example + add ggml_nbytes_pad() Georgi Gerganov 2023-08-15 16:01:38 +0300
  • 9eb1ef8653
    move and remove code xaedes 2023-08-15 14:03:02 +0200
  • 5e059ace25
    add stub example for finetuning, based on train-text-from-scratch xaedes 2023-08-15 13:54:28 +0200
  • 316b0707f4
    add API functions to access llama model tensors xaedes 2023-08-06 17:28:22 +0200
  • 4463965401
    gguf : fix header write Georgi Gerganov 2023-08-15 14:39:27 +0300
  • f6ecd15f83
    gguf : initial write API ready + example Georgi Gerganov 2023-08-15 14:35:00 +0300
  • 85ebfb8e5d
    gguf : write to file API (not tested) Georgi Gerganov 2023-08-15 14:26:28 +0300
  • 5cb9d9a87f
    gguf : initial write API (not tested yet) Georgi Gerganov 2023-08-15 13:39:10 +0300
  • 2d87c9c796
    llama : refactor tensor names (#2622) M. Yusuf Sarıgöz 2023-08-15 13:29:30 +0300
  • 29743cb83b
    gguf : define tensor names as constants M. Yusuf Sarıgöz 2023-08-15 12:54:11 +0300
  • 693bd398c5
    gguf: update tensor names searched in quantization M. Yusuf Sarıgöz 2023-08-15 12:37:10 +0300
  • da424b6699
    llama : gguf_file_saver write I32 Georgi Gerganov 2023-08-15 11:31:42 +0300
  • 9574f41818
    llama : no need to pass full file loader to the file saver Georgi Gerganov 2023-08-15 11:22:37 +0300
  • 5c85332e99
    llama : simplify write_header() Georgi Gerganov 2023-08-15 11:11:22 +0300
  • 6e29ed52fb
    llama : fix method names Georgi Gerganov 2023-08-15 11:10:26 +0300
  • c9c0b758d4
    llama : simplify gguf_file_saver Georgi Gerganov 2023-08-15 11:09:26 +0300
  • 66ce19aecb
    llama : fix quantization using gguf tool Georgi Gerganov 2023-08-15 10:55:42 +0300
  • a82e3a4d92
    llama : style formatting + remove helper methods Georgi Gerganov 2023-08-15 08:51:07 +0300
  • b5ffb2849d
    scripts : add helper script to get wikitext Georgi Gerganov 2023-08-15 10:04:58 +0300
  • c545d85f83
    Merge branch 'gguf' of https://github.com/ggerganov/llama.cpp into gguf goerch 2023-08-15 08:24:56 +0200
  • 99e0e90718
    Improved tokenizer test goerch 2023-08-15 08:23:35 +0200
  • 469d70be45
    add support for precompiled binaries, used as a fallback Concedo 2023-08-15 13:49:05 +0800
  • d2049bf03f
    fix lint and add Makefile drbh 2023-08-15 00:07:24 -0400
  • 995ddb963d
    adds simple llama grammar tests drbh 2023-08-14 23:55:45 -0400
  • bfa455de43
    metal: fix performance degradation from gqa lshzh-ww 2023-08-14 23:10:27 -0400
  • 5f6de2a2bb
    metal: matrix-matrix multiplication kernel lshzh-ww 2023-08-14 21:11:19 -0400
  • 2dd5d2c92c
    convert-llama-h5-to-gguf.py : add 70b gqa support klosax 2023-08-15 00:43:10 +0200
  • 006e74a493
    Merge branch 'master' into server-probs jhen 2023-08-15 06:14:57 +0800
  • 3ebb00935f
    server : add missing /json-schema-to-grammar.mjs (#2616) master-3ebb009 Jhen-Jie Hong 2023-08-15 06:14:14 +0800
  • ca4758290c
    gguf-llama.cpp : fix n_head_kv klosax 2023-08-14 23:18:41 +0200
  • 6a316fc1ab
    server : add missing /json-schema-to-grammar.mjs jhen 2023-08-15 04:05:34 +0800
  • ab2cbd03ca
    convert-llama-7b-pth-to-gguf.py : add token types klosax 2023-08-14 22:10:50 +0200
  • 3b5515bbe0
    reverse order of for loop in ggml_build_backward_expand to save memory when using gradient checkpointing and allocator xaedes 2023-08-14 22:09:36 +0200
  • cedb4870c6
    gguf.py : add token types klosax 2023-08-14 22:08:40 +0200
  • 5d518d421f
    constants.py : add token types klosax 2023-08-14 22:07:53 +0200
  • 7ec125b1dc
    convert-llama-h5-to-gguf.py : add token types klosax 2023-08-14 22:06:33 +0200
  • 56228461c8
    fix memory "leak" in optimizers xaedes 2023-08-14 21:12:02 +0200
  • 6c63550f63
    llama : update tokenizer style Georgi Gerganov 2023-08-14 22:10:19 +0300
  • 3e6468b097
    fix test when to create temporary backward graph xaedes 2023-08-14 20:56:03 +0200
  • 098654c277
    only use ggml_allocr_alloc when tensor has NULL data and is not a view xaedes 2023-08-14 20:56:56 +0200
  • faf3e21eaf
    add debug asserts in ggml_allocr_alloc to some common pitfalls when using this function directly xaedes 2023-08-14 20:50:09 +0200
  • 7494c78428
    llama : sync gguf-llama with llama (#2613) Georgi Gerganov 2023-08-14 21:33:33 +0300
  • e4b8f94d6b
    convert : update HF converter to new tokenizer voodoo magics Georgi Gerganov 2023-08-14 21:31:02 +0300
  • 95d7593e4a
    llama : sync gguf-llama.cpp Georgi Gerganov 2023-08-14 21:18:19 +0300
  • c35fc0bbb0
    convert : fix layer names Georgi Gerganov 2023-08-14 21:06:07 +0300
  • 01080a5a51
    tests : fix wstring_convert Georgi Gerganov 2023-08-14 20:50:15 +0300
  • aa0551a504
    tests : fix build + warnings (test-tokenizer-1 still fails) Georgi Gerganov 2023-08-14 20:14:55 +0300
  • 58fdf3a07a
    llama : sync gguf-llama with llama Georgi Gerganov 2023-08-14 19:58:05 +0300
  • afc4ca2889
    convert : update convert-new.py with tokenizer fixes (#2614) goerch 2023-08-14 19:20:04 +0200
  • c9c3b87a9e
    Merge branch 'gguf' of https://github.com/goerch/llama.cpp into gguf goerch 2023-08-14 19:11:44 +0200
  • cfb0e6ff16
    Adapt convert-new.py (and fix a clang-cl compiler error on windows) goerch 2023-08-14 19:08:44 +0200
  • 6e280b24dc
    remove unused forward_batch function xaedes 2023-08-14 19:02:12 +0200
  • 3794dceb7f
    remove unused train params: mem_compute1_gb & mem_compute2_gb xaedes 2023-08-14 18:44:42 +0200
  • 6f161c784b
    remove trailing whitespace xaedes 2023-08-14 18:33:27 +0200
  • 271e4d64b5
    remove unused training parameters "use_scratch" and "use_unified" xaedes 2023-08-14 18:31:59 +0200
  • c954f41ca4
    remove handwritten training functions xaedes 2023-08-14 18:27:01 +0200
  • ec1b100720
    llama : tokenizer fixes (#2549) goerch 2023-08-14 18:30:28 +0200
  • fe788a1c7a
    allocate graph on context using ggml_new_graph xaedes 2023-08-14 18:24:13 +0200
  • 75baed230c
    set names for tensors in unified train function for easier debugging xaedes 2023-08-14 18:17:14 +0200
  • 3e99a8d653
    format name of cloned tensors with " (clone)" suffix xaedes 2023-08-14 18:15:09 +0200
  • 865c4cd3c1
    integrate unified training function which may use memory allocator xaedes 2023-08-14 18:12:58 +0200
  • 4ed096c6b0
    add training options whether to use allocator and/or unified training function xaedes 2023-08-14 18:10:02 +0200
  • d6c5b03858
    fix ASSERT to work with zero layers xaedes 2023-08-14 18:08:19 +0200
  • 38f4438c32
    make sure some tensors are not reallocated by inserting new temporary nodes depending on them: xaedes 2023-08-14 18:07:16 +0200
  • 9716eb8ef0
    fix variable name and add missing boolean negation xaedes 2023-08-14 17:59:19 +0200
  • 5884b43a62
    add input tensors as checkpoints xaedes 2023-08-14 17:58:49 +0200
  • b2f1310196
    swap arguments to commutative ops to be the same as in forward_batch_wo_cache_flash_attn xaedes 2023-08-14 17:57:13 +0200
  • 5a11b75875
    fix variable names xaedes 2023-08-14 17:55:51 +0200
  • 345f516f7c
    correctly clone view tensors by setting data pointers xaedes 2023-08-14 17:55:13 +0200
  • 52c92c0a8c
    terminate recursive tensor cloning when reaching tensor without src tensors xaedes 2023-08-14 17:53:36 +0200
  • 0dd496c5e2
    fix variable name and add missing type cast xaedes 2023-08-14 17:52:48 +0200
  • cfddc36be2
    correctly clone reshape and permute operations by also cloning tensor->nb values xaedes 2023-08-14 17:52:15 +0200
  • d43741540b
    don't use allocate hash_map on context xaedes 2023-08-14 17:51:20 +0200
  • fc826c8ea8
    in train function replace add_inplace by regular add xaedes 2023-08-14 17:49:22 +0200
  • 7108448841
    Merge branch 'gguf' of https://github.com/goerch/llama.cpp into gguf goerch 2023-08-14 16:36:45 +0200
  • fb591e1f04
    Merge branch 'gguf' of https://github.com/ggerganov/llama.cpp into gguf goerch 2023-08-14 16:35:57 +0200
  • 712b614ad4
    Merge branch 'gguf' into gguf goerch 2023-08-14 16:22:50 +0200
  • 8af3a99ff1
    Merge branch 'master' into gguf Georgi Gerganov 2023-08-14 16:39:18 +0300
  • 6f14854880
    gitignore : add gptneox-main Georgi Gerganov 2023-08-14 16:39:02 +0300
  • d783f7982e
    metal : return null instead of exit(1) (#2573) master-d783f79 Jhen-Jie Hong 2023-08-14 21:37:39 +0800
  • d75561df20
    server : add --numa support (#2524) master-d75561d Cheng Shao 2023-08-14 15:36:42 +0200
  • 348acf188c
    llama : add missing enum keyword in function signatures (#2610) master-348acf1 Kamil Tomšík 2023-08-14 15:35:16 +0200
  • f00780b2ee
    llama : sync gguf-llama.cpp with latest llama.cpp (#2608) Georgi Gerganov 2023-08-14 16:28:44 +0300
  • 18d00611e2
    llama : minor Georgi Gerganov 2023-08-14 16:26:40 +0300
  • f85395252f
    llama : refactor gguf_buffer and gguf_ctx_buffer Georgi Gerganov 2023-08-14 14:44:55 +0300
  • 6f64b6c0f8
    Create convert-llama-7b-pth-to-gguf.py klosax 2023-08-14 13:51:09 +0200
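
Several of the commits above ("gguf : write to file API", "gguf : streaming support when writing files") revolve around writing GGUF model files incrementally: metadata and tensor descriptors go first, then tensor payloads are appended one at a time at aligned offsets, so no tensor has to be held in memory longer than its own write. The sketch below illustrates only that general idea; the layout, field names, and alignment value here are hypothetical and are not the actual GGUF format or llama.cpp API.

```python
import io
import struct

ALIGN = 32  # hypothetical alignment; the real format's alignment is a file-format detail


def align_up(n: int, a: int = ALIGN) -> int:
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a


def write_stream(f, tensors):
    """Stream out a toy model file: a metadata pass that records each
    payload's size and future offset, then a data pass that appends
    payloads one by one, padding each to its aligned offset."""
    # header: magic + tensor count (illustrative, not the real GGUF layout)
    f.write(b"GGUF")
    f.write(struct.pack("<q", len(tensors)))
    # metadata pass: name, payload size, and the offset the payload will land at
    offset = 0
    for name, data in tensors:
        enc = name.encode("utf-8")
        f.write(struct.pack("<q", len(enc)))
        f.write(enc)
        f.write(struct.pack("<q", len(data)))
        f.write(struct.pack("<q", offset))
        offset = align_up(offset + len(data))
    # data pass: stream each payload, zero-padding up to the recorded offsets
    pos = 0
    for _, data in tensors:
        f.write(data)
        pos += len(data)
        pad = align_up(pos) - pos
        f.write(b"\x00" * pad)
        pos += pad


buf = io.BytesIO()
write_stream(buf, [("tok_embd.weight", b"\x01" * 10), ("output.weight", b"\x02" * 7)])
```

Because offsets are computed up front from the sizes alone, each payload can be written (or generated) on the fly and discarded, which is the memory benefit the streaming-support commit is after.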