Commit Graph

  • 593b04fdcd convert-llama-7b-pth-to-gguf.py : remove extra kv (klosax, 2023-08-19 00:32:27 +0200)
  • c0e4ca630b convert-gptneox-hf-to-gguf.py : remove extra kv (klosax, 2023-08-19 00:31:56 +0200)
  • 16ab9ba3b3 convert-falcon-hf-to-gguf.py : remove extra kv (klosax, 2023-08-19 00:31:28 +0200)
  • d5e976c12b falcon-main.cpp : falcon inference example (klosax, 2023-08-19 00:02:18 +0200)
  • 95f2c5d475 Merge 1c154e9ea5 into 1f0bccb279 (Eve, 2023-08-18 21:50:59 +0000)
  • 1c154e9ea5 lazy fix for llama-bench (runs without pp_threads support) (netrunnereve, 2023-08-18 17:49:04 -0400)
  • 1f0bccb279 server : better default prompt (#2646) (Georgi Gerganov, 2023-08-19 00:45:36 +0300)
  • f63564adfa server : update xxd usage for older versions compatibility (#2649) (Jhen-Jie Hong, 2023-08-19 05:41:32 +0800)
  • a129a31457 Merge branch 'ggerganov:master' into master (Eve, 2023-08-18 21:17:06 +0000)
  • a217151444 add gqa parameter (Llama 2 70b support) (Colin Calvert, 2023-08-18 15:41:51 -0500)
  • 2d8b76a110 Add link to clojure bindings to Readme. (#2659) (Adrian, 2023-08-18 12:39:22 -0700)
  • 37dfb544aa resolve todo (xaedes, 2023-08-18 21:22:41 +0200)
  • 3e47890760 remove unnecessary src tensor from ggml_repeat & ggml_repeat_back (xaedes, 2023-08-18 20:51:00 +0200)
  • 65b0561637 remove unnecessary src tensor from ggml_get_rows_back (xaedes, 2023-08-18 20:25:03 +0200)
  • fb7c883cd3 convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested (klosax, 2023-08-18 20:14:01 +0200)
  • 6c98640035 bug fix: make sure finetune input gradient is allocated at begin and kept until end (xaedes, 2023-08-18 20:10:04 +0200)
  • 210b07f980 Add link to clojure bindings to Readme. (Adrian, 2023-08-18 11:08:44 -0700)
  • 63cb374a99 change default finetune params lora_r and lora_alpha to match the n_rank parameters of 4 (xaedes, 2023-08-18 19:08:15 +0200)
  • 25b8a8922d llama : introduce enum llama_vocab_type + remove hardcoded string constants (Georgi Gerganov, 2023-08-18 18:46:38 +0300)
  • 7a63d429af adjust maximal values to support finetuning 3B models (xaedes, 2023-08-18 17:32:31 +0200)
  • 7af633aec3 readme : incoming BREAKING CHANGE (Georgi Gerganov, 2023-08-18 17:48:31 +0300)
  • a4ad2bf35c llama : fix MPI build (Georgi Gerganov, 2023-08-18 17:34:27 +0300)
  • 5d2656d670 llama : avoid hardcoded special tokens (Georgi Gerganov, 2023-08-18 17:29:20 +0300)
  • 113c90f1cc improve optimization iteration prints (xaedes, 2023-08-18 16:24:42 +0200)
  • a0c2752ba7 remove debug prints and function to compute tensor data hash (xaedes, 2023-08-18 16:24:13 +0200)
  • 035d511457 llama : minor API updates (Georgi Gerganov, 2023-08-18 17:06:34 +0300)
  • 011f47f972 remove trailing whitespace (xaedes, 2023-08-18 16:02:46 +0200)
  • f358204a5f avoid keeping in memory ALL of the gradients (xaedes, 2023-08-18 16:01:43 +0200)
  • 2d6c2c757c llama : remove C++ API + reorganize common source in /common dir (Georgi Gerganov, 2023-08-18 16:22:48 +0300)
  • a252111b45 fix bug in ggml_out_prod which resulted in wrong n_dims of result tensors (xaedes, 2023-08-18 15:03:57 +0200)
  • 44526cb261 make sure base model tensors data cannot be used in viewable operations (xaedes, 2023-08-18 15:03:17 +0200)
  • 38016ed9ec Merge branch 'master' into gguf (Georgi Gerganov, 2023-08-18 15:21:48 +0300)
  • 660ca9bbca llama : re-order functions (Georgi Gerganov, 2023-08-18 14:56:36 +0300)
  • 097e121e2f llama : add benchmark example (#2626) [master-097e121] (slaren, 2023-08-18 12:44:58 +0200)
  • eaf98c2649 readme : add link to Rust bindings (#2656) (mdrokz, 2023-08-18 15:47:58 +0530)
  • 80e1ca4853 chore: add rust bindings to readme (mdrokz, 2023-08-18 15:37:06 +0530)
  • e9b12c332e perplexity : more meaningful ETA number - 2 decimal points [master-e9b12c3] (Georgi Gerganov, 2023-08-18 12:48:55 +0300)
  • dea5be61d7 editorconfig : fix whitespaces (Georgi Gerganov, 2023-08-18 12:42:38 +0300)
  • e35f8c744e tests : update vocab file with new magic (Georgi Gerganov, 2023-08-18 12:39:09 +0300)
  • 856afff746 Merge branch 'master' into gguf (Georgi Gerganov, 2023-08-18 12:38:05 +0300)
  • aa3efe87c8 llama : print number of tensors per type + print arch + style (Georgi Gerganov, 2023-08-18 10:36:45 +0300)
  • 06a883f7b1 remove unused $func (Jhen, 2023-08-18 15:35:20 +0800)
  • a7871acced Merge remote-tracking branch 'origin/master' into prompt-array (Xiao-Yong Jin, 2023-08-17 21:49:35 -0500)
  • b275de745d llama.cpp : get special token kv and linefeed token id (klosax, 2023-08-18 03:34:30 +0200)
  • 604b8bdfa6 Fix unicode in grammars (fixes #2501) (#2553) [master-604b8bd] (Evan Jones, 2023-08-17 19:54:44 -0400)
  • 5c6aee64de Merge branch 'master' into server-probs (jhen, 2023-08-18 07:38:35 +0800)
  • 10151bee2e server : support for saving templates in browser LocalStorage (#2486) [master-10151be] (staviq, 2023-08-17 23:34:01 +0000)
  • af2cd7f8be fix test-llama-grammar (Evan Jones, 2023-08-17 19:17:43 -0400)
  • 306070c896 llama.cpp : print kv general.name (klosax, 2023-08-18 01:06:27 +0200)
  • e029b50351 Merge remote-tracking branch 'upstream/master' into fix-unicode-2 (Evan Jones, 2023-08-17 19:01:19 -0400)
  • 0bb897c82a bug fix: actually use result type passed to ggml_add_cast (xaedes, 2023-08-17 23:48:30 +0200)
  • 15f7448611 server : update xxd usage for older versions compatibility (Jhen, 2023-08-18 06:16:43 +0800)
  • 0992a7b8b1 README: fix LLAMA_CUDA_MMV_Y documentation (#2647) (Johannes Gäßler, 2023-08-17 23:57:59 +0200)
  • e830035c1c server : use xxd in public/ for simplify func name (Jhen, 2023-08-18 05:45:08 +0800)
  • 1664826c5a README: fix LLAMA_CUDA_MMV_Y documentation (JohannesGaessler, 2023-08-17 23:37:54 +0200)
  • d9e6890a51 test-tokenizer-0.cpp : fix warning (klosax, 2023-08-17 23:34:21 +0200)
  • 147a99bd3a gguf.py : reverse GGUF_MAGIC (klosax, 2023-08-17 23:24:04 +0200)
  • c20ae49b59 ggml.h : reverse GGUF_MAGIC (klosax, 2023-08-17 23:23:17 +0200)
  • 3b4368471a server : better default prompt (Georgi Gerganov, 2023-08-17 23:27:42 +0300)
  • df87dd74a5 formatting (slaren, 2023-08-17 22:11:40 +0200)
  • 6ddeefad9b [Zig] Fixing Zig build and improvements (#2554) (Henri Vasserman, 2023-08-17 23:11:18 +0300)
  • 3c1b7217a9 convert-llama-7b-pth-to-gguf.py : fixes (klosax, 2023-08-17 21:44:34 +0200)
  • 9e2d4dd48e convert-llama-hf-to-gguf.py : fixes (klosax, 2023-08-17 21:43:48 +0200)
  • 640ddc4259 gguf.py : gptneox mapping (klosax, 2023-08-17 21:43:10 +0200)
  • b668cd3296 convert-gptneox-hf-to-gguf.py : fixes (klosax, 2023-08-17 21:42:26 +0200)
  • fc3a523211 gguf.py : write tensors in a single pass (#2644) (M. Yusuf Sarıgöz, 2023-08-17 21:57:39 +0300)
  • 6a9e6375b5 gguf.py : indentation [gguf-write-single-pass] (Georgi Gerganov, 2023-08-17 21:53:15 +0300)
  • 307e09cd85 Merge branch 'gguf' into gguf-write-single-pass (Georgi Gerganov, 2023-08-17 21:51:15 +0300)
  • e426b3cfc8 gguf.py : fix vertical alignment (Georgi Gerganov, 2023-08-17 21:50:01 +0300)
  • 5484737d58 llama : fix tensor name grepping during quantization (Georgi Gerganov, 2023-08-17 21:40:51 +0300)
  • 57eaadb853 llama : throw error if gguf fails to init from file (Georgi Gerganov, 2023-08-17 21:31:52 +0300)
  • b3cc182990 llama.cpp : typo (klosax, 2023-08-17 20:27:50 +0200)
  • acaa98234a convert.py : fix HF tensor permuting / unpacking (Georgi Gerganov, 2023-08-17 21:06:45 +0300)
  • 78e1e57862 quantize-stats.cpp : .bin --> .gguf (klosax, 2023-08-17 19:18:24 +0200)
  • fb11dd3f92 common.h : .bin --> .gguf (klosax, 2023-08-17 19:16:35 +0200)
  • e72c8c2124 ggml : fix bug in gguf_set_kv (Georgi Gerganov, 2023-08-17 20:13:12 +0300)
  • 4dbce7d009 gguf : rm file_type key and method (M. Yusuf Sarıgöz, 2023-08-17 20:02:38 +0300)
  • 1d93d04ce2 gguf : refactor pth to gguf conversion script (M. Yusuf Sarıgöz, 2023-08-17 19:58:27 +0300)
  • 899f9a5350 llama : fix lambda capture (Georgi Gerganov, 2023-08-17 19:49:21 +0300)
  • 93f285bdf1 gptneox : move as a WIP example (Georgi Gerganov, 2023-08-17 19:38:48 +0300)
  • f71704177f gguf : rename h5 to hf (for HuggingFace) (M. Yusuf Sarıgöz, 2023-08-17 19:49:15 +0300)
  • 81a2c2a6f4 llama : fix llama_model_loader memory leak (Georgi Gerganov, 2023-08-17 19:49:02 +0300)
  • 9f02694c91 gguf : refactor gptneox conversion script (M. Yusuf Sarıgöz, 2023-08-17 19:45:06 +0300)
  • dd9e2fc988 ci : update ".bin" to ".gguf" extension (Georgi Gerganov, 2023-08-17 19:32:14 +0300)
  • c3b739374e editorconfig : ignore models folder (Georgi Gerganov, 2023-08-17 19:17:25 +0300)
  • 22c61c5b45 gguf : style fixes in simple conversion script (M. Yusuf Sarıgöz, 2023-08-17 19:05:43 +0300)
  • 6d66ef96eb Merge branch 'master' into gguf (Georgi Gerganov, 2023-08-17 19:04:59 +0300)
  • 11bf4366c2 llama : sync with recent PRs on master (Georgi Gerganov, 2023-08-17 19:03:15 +0300)
  • 2f8fc92d86 gguf : fix conflicts (M. Yusuf Sarıgöz, 2023-08-17 18:51:14 +0300)
  • 8ace03ad3d convert.py : better always have n_head_kv and default it to n_head (Georgi Gerganov, 2023-08-17 18:47:06 +0300)
  • b6c81e28cd improve formatting (slaren, 2023-08-17 17:28:55 +0200)
  • d646c4efce convert.py : n_head_kv optional and .gguf file extension (klosax, 2023-08-17 17:20:36 +0200)
  • 36b0c5b398 fix for incorrect missing backends displayed (Concedo, 2023-08-17 22:45:49 +0800)
  • 5b8485b6ae Regen index.html.cpp, suggested from code review (staviq, 2023-08-17 16:42:45 +0200)
  • 4a18c88143 Merge branch 'ggerganov:master' into master (staviq, 2023-08-17 14:34:02 +0000)
  • bd815c9c86 Apply suggestions from code review (staviq, 2023-08-17 14:29:36 +0000)
  • dd016cc246 Revert "ci : disable CI temporary to not waste energy" (Georgi Gerganov, 2023-08-17 17:23:16 +0300)
  • 2ddd9681d6 convert.py : update to support GGUF output (Georgi Gerganov, 2023-08-17 17:22:43 +0300)
  • e0429d38e4 convert-new.py : output gguf (#2635) (Georgi Gerganov, 2023-08-17 17:19:52 +0300)
  • 663d952abb llama : style fixes (Georgi Gerganov, 2023-08-17 17:19:31 +0300)
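
Several commits above concern the GGUF file magic (`gguf.py : reverse GGUF_MAGIC`, `ggml.h : reverse GGUF_MAGIC`) and the `.bin --> .gguf` extension migration. For context only: a GGUF file begins with the 4-byte ASCII magic `GGUF`. A minimal sketch of detecting that magic (the function name `is_gguf` is hypothetical, not part of the repository):

```python
def is_gguf(path: str) -> bool:
    # A GGUF file starts with the 4-byte ASCII magic b"GGUF".
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```

This checks only the leading bytes; the actual loader in the commits above also validates the version field and key-value metadata that follow the magic.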