Commit Graph

  • 103cfafc77
    gguf : fix strings to not be null-terminated (#2839) b1096 Georgi Gerganov 2023-08-27 21:50:22 +0300
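A note on the string fix above: GGUF stores strings as length-prefixed byte runs, so a writer must neither append a trailing NUL nor count one in the length. A minimal sketch of the on-disk shape, assuming a v2-style 64-bit length prefix (the function is illustrative, not the ggml API):

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Illustrative only: emit a GGUF-style string as <uint64 length><bytes>.
// The length excludes any terminator and no '\0' follows the payload
// (fwrite uses host byte order; GGUF assumes a little-endian host).
static void write_gguf_str(FILE * f, const char * s) {
    const uint64_t n = strlen(s);
    fwrite(&n, sizeof(n), 1, f);
    fwrite(s, 1, n, f);
}
```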
  • e23553c6c1
    gguf : fix gguf_add_tensor name Georgi Gerganov 2023-08-27 21:21:44 +0300
  • cd7b3edeed
    clarified wording JackJollimore 2023-08-27 14:51:57 -0300
  • 364d684b9a
    Improve README.md for building in Termux on Android devices JackJollimore 2023-08-27 14:44:20 -0300
  • 34e5c9afe5
    gguf : fix strings to not be null-terminated Georgi Gerganov 2023-08-27 20:43:32 +0300
  • 9c8b14bc47
    Improve README.md for building in Termux on Android devices. JackJollimore 2023-08-27 14:27:37 -0300
  • 46395e6311
    Merge branch 'ggerganov:master' into systemd-units JohnnyB 2023-08-27 17:03:33 +0100
  • c10704d01e
    llama : fix MPI threads (close #2827) b1095 Georgi Gerganov 2023-08-27 18:55:41 +0300
  • 6206ed6b72
    Missing dependency clblast JohnnyB 2023-08-27 16:55:02 +0100
  • bb24276c69
    quantize : fix path parsing on Windows Cebtenzzre 2023-08-27 11:48:35 -0400
  • aeb19505bc
    Corrections and systemd units John Boero 2023-08-27 16:40:01 +0100
  • 230d46c723
    examples : update llama2.c converter to read vocab and write models in GGUF format (#2751) b1094 Olivier Chafik 2023-08-27 15:13:31 +0100
  • c2899b0fd1
    CUDA: fix RoPE asserts, block sizes JohannesGaessler 2023-08-27 13:59:25 +0200
  • 463173a6c0
    llama : speedup tokenization (#2831) b1093 Kawrakow 2023-08-27 16:50:33 +0300
  • 2fae21ea78
    llama-bench : set locale to utf8 slaren 2023-08-27 15:28:41 +0200
  • 86e3511500
    Fixit: it was missing the piece after the last found occurrence Iwan Kawrakow 2023-08-27 16:43:50 +0300
  • eaa13a48ff
    falcon : fix CUDA inference by making K and Q contiguous (#2830) b1092 Georgi Gerganov 2023-08-27 16:40:48 +0300
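On the Falcon fix above: the CUDA RoPE path requires contiguous tensors (see the non-cont rope assert a couple of entries below), while permuted views of K and Q are not. The fix materializes them with ggml_cont; a sketch of the pattern, where ctx0/Kcur/Qcur are assumed names mirroring the model graph code:

```cpp
#include "ggml.h"

// ggml_cont copies a (possibly permuted) view into contiguous row-major
// storage so downstream ops such as ggml_rope see flat data.
static void make_kq_contiguous(struct ggml_context * ctx0,
                               struct ggml_tensor ** Kcur,
                               struct ggml_tensor ** Qcur) {
    *Kcur = ggml_cont(ctx0, *Kcur);
    *Qcur = ggml_cont(ctx0, *Qcur);
}
```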
  • 5021d7bc3f
    Speedup tokenization Iwan Kawrakow 2023-08-27 16:28:40 +0300
  • cc924c57ee
    cuda : add assert to guard from non-cont ropes Georgi Gerganov 2023-08-27 16:00:55 +0300
  • 7c55447f7f
    falcon : fix CUDA inference by making K and Q contiguous Georgi Gerganov 2023-08-27 15:56:03 +0300
  • da7455d046
    readme : fix headings Georgi Gerganov 2023-08-27 15:52:34 +0300
  • 25423e9185
    scripts : helper convert script Georgi Gerganov 2023-08-27 15:24:40 +0300
  • a6d1189fdd
    k_quants tuning for Falcon-7b (#2816) b1089 Kawrakow 2023-08-27 15:19:59 +0300
  • c48c5bb0b0
    readme : update hot topics Georgi Gerganov 2023-08-27 14:44:35 +0300
  • d0cee0d36d
    gguf : add 64-bit support (GGUF v2) (#2821) b1087 Georgi Gerganov 2023-08-27 14:19:54 +0300
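The v2 bump above lands the 64-bit widening detailed in the gguf.py commits further down this graph: tensor and KV counts plus string lengths become uint64_t, while enum types and n_dims stay uint32_t. A hedged sketch of reading the widened header (struct and function names are illustrative, and no v1 backwards compatibility is shown):

```cpp
#include <cstdint>
#include <cstdio>

// GGUF v2 header: magic/version stay 32-bit, counts widen to 64-bit.
struct gguf_header_v2 {
    uint32_t magic;      // 'GGUF'
    uint32_t version;    // == 2
    uint64_t n_tensors;  // uint32_t in v1
    uint64_t n_kv;       // uint32_t in v1
};

static bool read_header_v2(FILE * f, gguf_header_v2 & h) {
    return fread(&h.magic,     sizeof(h.magic),     1, f) == 1 &&
           fread(&h.version,   sizeof(h.version),   1, f) == 1 &&
           fread(&h.n_tensors, sizeof(h.n_tensors), 1, f) == 1 &&
           fread(&h.n_kv,      sizeof(h.n_kv),      1, f) == 1;
}
```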
  • edd4c14817
    llama : more tokenizer fixes (#2810) b1086 Georgi Gerganov 2023-08-27 14:19:19 +0300
  • 841983fe47
    common : temporary separate llama_detokenize calls for SPM and BPE Georgi Gerganov 2023-08-27 13:04:04 +0300
  • 21df40d0c4
    fix offloading logic JohannesGaessler 2023-08-27 11:21:26 +0200
  • 3bb0f84932
    tests : add falcon tests (py + cpp, currently do not pass Unicode) Georgi Gerganov 2023-08-27 11:26:48 +0300
  • 061f777de0
    k_quants tuning for Falcon-7b Iwan Kawrakow 2023-08-27 11:33:19 +0300
  • 18a131d5e3
    Make ggml-cuda.cu build with QK_K = 64 Iwan Kawrakow 2023-08-26 19:35:24 +0300
  • 1591e2e590
    ggml : detect SSSE3 (#2825) b1085 Przemysław Pawełczyk 2023-08-27 10:10:25 +0200
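On the SSSE3 detection above (the underlying commits appear further down): ggml reports ISA support from compile-time predefines rather than runtime CPUID, so the result reflects the build flags. The pattern, in the style of the existing ggml_cpu_has_* helpers:

```cpp
// __SSSE3__ is predefined by the compiler under -mssse3 (or a -march
// that implies it), so this answers "was the binary built with SSSE3",
// not "does this CPU support SSSE3".
int ggml_cpu_has_ssse3(void) {
#if defined(__SSSE3__)
    return 1;
#else
    return 0;
#endif
}
```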
  • 958f5f7038
    add test with q8_0 (cpu only) slaren 2023-08-18 18:44:53 +0200
  • f430e7f821
    add 7b lora test slaren 2023-08-18 18:01:00 +0200
  • acca961dd7
    ci : decrease CPU ppl runs to 2 to avoid 20 min timeout Georgi Gerganov 2023-08-18 13:02:56 +0300
  • 6e5297bc16
    move lora summary to the top, add lora logs slaren 2023-08-18 03:14:58 +0200
  • 465a98886f
    ci : add lora test slaren 2023-08-18 02:54:25 +0200
  • 7f1c434e73
    quantize : make output filename optional again Cebtenzzre 2023-08-27 01:21:27 -0400
  • 2d7a0fbe68
    Replace make_half2 with __halves2half2 lijiahao 2023-08-27 11:14:32 +0800
  • af31f1f00d
    Use make_half2 for better compatibility lijiahao 2023-08-27 11:06:28 +0800
  • 9d5b4238e8
    added config to class.py Concedo 2023-08-27 10:32:01 +0800
  • eed5d0e386
    llama : show SSSE3 in system info Przemyslaw Pawelczyk 2023-08-27 02:26:32 +0200
  • dd8d05b918
    ggml : add ggml_cpu_has_ssse3 Przemyslaw Pawelczyk 2023-08-27 02:25:13 +0200
  • 789c8c945a
    ci : add LoRA test to CI (#2650) slaren 2023-08-27 09:03:27 +0200
  • c1ac54b77a
    server : add /detokenize endpoint (#2802) b1083 Bruce MacDonald 2023-08-26 16:11:45 -0700
  • 21757ee5b6
    Added FIM token IDs. apaz-cli 2023-08-26 17:11:41 -0500
  • c58792c768
    llama2.c converter: cleanups + take n_ff from config ochafik 2023-08-26 23:09:22 +0100
  • 33a5517d87
    llama.cpp : print gguf version gguf-64bit klosax 2023-08-26 23:56:48 +0200
  • dbcf470bc6
    hellaswag : move the concat space for clarity Georgi Gerganov 2023-08-27 00:44:49 +0300
  • ab3ba64f62
    llama.cpp : fix LF token klosax 2023-08-26 23:03:01 +0200
  • 0722e58ac2
    llama2.c: escape whitespaces w/ U+2581 in vocab converter the llama.cpp way ochafik 2023-08-26 22:43:00 +0100
  • c767746399
    Merge branch 'master' into fix-tokenizer Georgi Gerganov 2023-08-27 00:42:05 +0300
  • eb8b3264f6
    tests : add test-tokenizer-1.py Georgi Gerganov 2023-08-27 00:41:44 +0300
  • 20c44711bc
    llama2.c: use defines for gguf keys ochafik 2023-08-26 21:41:53 +0100
  • b61b170005
    gguf : fix typo Georgi Gerganov 2023-08-26 23:14:19 +0300
  • 730d9c681e
    convert.py : advanced option (#2753) Kerfuffle 2023-08-26 14:13:36 -0600
  • df3b81ab29
    llama2.c: update default path for vocab model + readme ochafik 2023-08-26 20:59:46 +0100
  • 09b6da741e
    gguf.py : string len uint64_t and n_dims uint32_t klosax 2023-08-26 21:53:56 +0200
  • 6d369a1558
    gguf : update all counts to 64-bit Georgi Gerganov 2023-08-26 22:41:55 +0300
  • bc3eaf262e
    gguf.py : string lengths uint32_t klosax 2023-08-26 21:29:36 +0200
  • be726c57ee
    gguf.py : uint64_t on all lengths, sizes and counts, enums still uint32_t klosax 2023-08-26 21:23:12 +0200
  • ba335ff5b2
    gguf.py : bump GGUF version Georgi Gerganov 2023-08-26 22:13:05 +0300
  • 3656b3ce81
    gguf : v1 backwards comp Georgi Gerganov 2023-08-26 22:11:42 +0300
  • 4f0547e4a3
    gguf : add support for 64-bit (no backwards comp yet) Georgi Gerganov 2023-08-26 22:05:14 +0300
  • 2978e03086
    update readme with gguf filenames xaedes 2023-08-26 21:04:14 +0200
  • 167dd2dcec
    add checkpoint file version for future compatibility xaedes 2023-08-26 21:04:01 +0200
  • ff83017428
    improve ggml_vec_dot_q4_K_q8_K AVX2 by 10% by reducing instruction dependencies Ronny Brendel 2023-08-22 10:39:36 +0200
  • c88362d194
    llama2.c: support copying vocab from a llama gguf model file ochafik 2023-08-26 19:47:11 +0100
  • 5f1fffd2d4
    gguf : bump version to 2 Georgi Gerganov 2023-08-26 21:52:27 +0300
  • c7d92e6dfe
    llama : use Unicode Escape Sequence to replace encoded characters (#2814) b1081 Tim Miller 2023-08-27 03:27:07 +0900
  • 61d1a2895e
    flake.nix : add rocm support and cleanup (#2808) Tungsten842 2023-08-26 20:19:44 +0200
  • 741ca7dd1c
    llama : move #includes out of _GNU_SOURCE conditional (#2817) b1079 Cebtenzzre 2023-08-26 14:17:51 -0400
  • 72f895c923
    main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (#1528) b1078 Dr. Tom Murphy VII Ph.D 2023-08-26 14:12:56 -0400
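On the penalize_nl fix above: the old code saved the newline logit and wrote it back into the raw logits array, but the penalty samplers mutate the candidates copy, so the restore was a no-op. A hedged sketch of the corrected restore (names follow the common example code and are assumptions here; note the companion commit just below, which threads ctx through to llama_token_nl()):

```cpp
#include "llama.h"

// Restore the newline logit inside the candidates array that the penalty
// samplers actually mutate; writing back into llama_get_logits() memory
// (the old behavior) had no effect on sampling.
static void restore_nl_logit(llama_context * ctx,
                             llama_token_data_array & candidates_p,
                             float nl_logit, bool penalize_nl) {
    if (penalize_nl) {
        return;
    }
    for (size_t i = 0; i < candidates_p.size; ++i) {
        if (candidates_p.data[i].id == llama_token_nl(ctx)) {
            candidates_p.data[i].logit = nl_logit;
            break;
        }
    }
}
```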
  • 1bf050c2a8
    main : pass ctx to llama_token_nl() Georgi Gerganov 2023-08-26 21:06:41 +0300
  • 0d29c8aaef
    Merge branch 'master' into master Georgi Gerganov 2023-08-26 21:03:38 +0300
  • f5d4b48297
    main : fix indentation Georgi Gerganov 2023-08-26 21:02:41 +0300
  • 8a136017f0
    remove trailing white-space Bruce MacDonald 2023-08-26 13:27:27 -0400
  • 46ec18406f
    llama : move #includes out of _GNU_SOURCE conditional Cebtenzzre 2023-08-26 13:08:50 -0400
  • 50526f37eb
    llama : use std::abs in llama_sample_tail_free (#2800) b1077 Cebtenzzre 2023-08-26 12:53:52 -0400
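The std::abs change above guards a classic C++ pitfall: with only the C overload set visible, an unqualified abs on a float can resolve to abs(int) and silently truncate. A minimal illustration (the function here is a stand-in, not the sampler itself):

```cpp
#include <cstdlib>  // exposes ::abs(int)
#include <cmath>    // std::abs has float/double overloads

float curvature(float a, float b, float c) {
    // Bug pattern: ::abs(a - 2*b + c) may call abs(int) and truncate,
    // e.g. 0.7f -> 0. Qualifying with std:: selects the float overload.
    return std::abs(a - 2*b + c);
}
```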
  • e4324cbd4d
    tests : add option to tokenize text files Georgi Gerganov 2023-08-26 19:21:22 +0300
  • 5cea869275
    fix stray whitespace after master sync staviq 2023-08-26 18:08:55 +0200
  • 5031c50e48
    Merge branch 'master' into betterlogs staviq 2023-08-26 17:57:10 +0200
  • e99f039c9e
    cleanup main.cpp:273 staviq 2023-08-26 17:52:25 +0200
  • 9be7e2b0bd
    llama : use std::abs in llama_sample_tail_free Cebtenzzre 2023-08-25 18:16:58 -0400
  • 5f23d41faa
    Refactor types KerfuffleV2 2023-08-26 09:23:06 -0600
  • 70005bd5c9
    tests : use Python to generate tokenizer tests for C++ Georgi Gerganov 2023-08-26 18:05:59 +0300
  • dfa058ef73
    examples : no longer manually add leading space when tokenizing Georgi Gerganov 2023-08-26 17:51:35 +0300
  • 1e7a033f10
    common : add comments Georgi Gerganov 2023-08-26 17:42:33 +0300
  • 04f4b1eb10
    k-quants : remove unnecessary tensor shape restrictions (#2811) b1076 Georgi Gerganov 2023-08-26 17:37:35 +0300
  • 9668aa115c
    llama : distinguish pieces from decoded text + fix detokenization Georgi Gerganov 2023-08-26 17:35:45 +0300
  • 7592375403
    Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (#2807) b1075 Kawrakow 2023-08-26 17:27:49 +0300
  • 9ee138e62f
    Use Unicode Escape Sequence to replace encoded characters Tim Miller 2023-08-26 23:17:48 +0900
  • 5d0ffb69f5
    llama : prefix input text for tokenization with whitespace Georgi Gerganov 2023-08-26 17:08:59 +0300
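On the whitespace entry above, together with the "no longer manually add leading space" commit earlier in this graph: SPM tokenizers expect raw text to carry a leading space, and these commits move that prefixing out of the examples and into tokenization itself. A sketch of the shape of the change, with the helper name assumed for illustration:

```cpp
#include <string>

// Before: call sites did the SPM-style prefixing themselves:
//   tokenize(" " + user_text, ...);
// After: tokenization prepends the space internally, so call sites
// pass user_text unchanged. Sketch of the internal step (assumed):
static std::string prefix_for_spm(const std::string & text) {
    return " " + text;  // leading whitespace expected by SPM models
}
```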
  • 771551a793
    Fix HellaSwag (#2805) b1074 Kawrakow 2023-08-26 16:48:53 +0300
  • 3979af1e58
    PR comment Iwan Kawrakow 2023-08-26 16:44:22 +0300
  • b398d885a1
    flake.nix: add rocm support and cleanup Tungsten842 2023-08-26 14:49:15 +0200
  • f305bad11e
    flake : build llama.cpp on Intel with nix (#2795) Volodymyr Vitvitskyi 2023-08-26 14:25:39 +0100
  • 6a20f7a2f0
    bug fixes xaedes 2023-08-25 22:32:39 +0200
  • d01f52409f
    Added const if possible lijiahao 2023-08-26 21:14:09 +0800
  • eff86d4f13
    k-quants : remove unnecessary tensor shape restrictions Georgi Gerganov 2023-08-26 16:05:18 +0300