Commit Graph

  • 7694adda8d Fix for main example getting stuck when -n -2 and --interactive (#2767) b1050 Kerfuffle 2023-08-24 10:11:13 -0600
  • fea95c682d fix convert.py for codellama, add llama 34B to the list of recognized models (#2768) b1049 slaren 2023-08-24 17:44:11 +0200
  • f06caa3723 fix convert.py for codellama, add llama 34B to the list of recognized models slaren 2023-08-24 17:30:24 +0200
  • c2f1790be9 Add a comment so future generations may suffer less. KerfuffleV2 2023-08-24 09:21:39 -0600
  • e979cef2ff Attempted fix for main example getting stuck when -n -2 and --interactive KerfuffleV2 2023-08-24 09:09:49 -0600
  • 4c4e4358ed fixed linux build error Concedo 2023-08-24 22:12:56 +0800
  • ef955fbd23 Tag release with build number (#2732) b1048 DannyDaemonic 2023-08-24 06:58:02 -0700
  • 4072f20bba add missing lctx argument to get_example_targets_batch xaedes 2023-08-24 15:49:34 +0200
  • 0c52c65d7f Merge branch 'master' into pr-train-mem-usage-improvements xaedes 2023-08-24 15:46:52 +0200
  • d67777c202 metal : add Q8_0 support (#2763) Georgi Gerganov 2023-08-24 16:19:57 +0300
  • 661bede62f optimize tokenize method Concedo 2023-08-24 21:16:16 +0800
  • 84e8da665d ggml.c : use ggml_float for gelu klosax 2023-08-24 15:13:18 +0200
  • 1202e06c6f metal : add Q8_0 mul_mm kernel Georgi Gerganov 2023-08-24 15:42:29 +0300
  • b95a4ccb22 added a token counting endpoint, set mmq as default Concedo 2023-08-24 20:41:49 +0800
  • 61c8259a88 metal : add mul_mat_q8_0_f32 kernel Georgi Gerganov 2023-08-24 15:32:27 +0300
  • 797312e758 ggml.c : use double precision for tanh klosax 2023-08-24 14:18:57 +0200
  • 7ec7ef94a9 skip-unused: disable skipping on ROCm / when LLAMA_USE_HIPBLAS ochafik 2023-08-23 21:36:56 +0100
  • 2cf4f62e12 Skip computation of unused logits during batch prompt eval (drop other batch positions after writing their kv to cache) ochafik 2023-08-18 01:46:20 +0100
  • d30cb53c9d metal : use metal_printf for debug logging Ravindra Marella 2023-08-24 16:50:05 +0530
  • 8e2b5abaa4 Fix SAFE_NAME Danny Daemonic 2023-08-24 04:04:53 -0700
  • 238335f54f fix -nommq help for non CUDA/HIP Henri Vasserman 2023-08-24 14:03:31 +0300
  • 81ecaa4b6c fix llama-bench Henri Vasserman 2023-08-24 13:52:51 +0300
  • a9efc5e417 Prefix the build number with b Danny Daemonic 2023-08-24 03:50:49 -0700
  • a60231f786 Add Dockerfiles Henri Vasserman 2023-08-24 13:45:05 +0300
  • 46a0881c7f metal : add dequantize_q8_0 kernel Georgi Gerganov 2023-08-24 13:40:34 +0300
  • 058f905ef9 ignore all build dirs Henri Vasserman 2023-08-24 13:23:23 +0300
  • 7b842170c4 Merge 'origin/master' into hipblas Henri Vasserman 2023-08-24 13:18:58 +0300
  • c3e53b421a llama : escape all U+2581 in a string (#2750) b1047 Georgi Gerganov 2023-08-24 12:26:01 +0300
  • ac4bb6ba02 cuda : add RoPE kernel for mode == 2 (NeoX) Georgi Gerganov 2023-08-24 11:13:56 +0300
  • 81a0ef342c updated lite, switched to unminified source Concedo 2023-08-24 16:26:38 +0800
  • 598d4d89ab fix for config file loading from kcpp settings file Concedo 2023-08-24 15:45:33 +0800
  • a3b9949626 Merge remote-tracking branch 'pop/add_config_arg' into concedo_experimental Concedo 2023-08-24 15:22:17 +0800
  • b8372d4466 Merge branch 'master' into concedo_experimental Concedo 2023-08-24 15:21:24 +0800
  • 0288361b65 gguf : fix line endings M. Yusuf Sarıgöz 2023-08-24 09:26:13 +0300
  • 344f6e373b gguf: prepare as Pip package M. Yusuf Sarıgöz 2023-08-24 09:09:52 +0300
  • 5dd870574e gguf: prepare as Pip package M. Yusuf Sarıgöz 2023-08-24 09:08:19 +0300
  • 050046fa45 gitignore : add dist and rm pyproject.toml M. Yusuf Sarıgöz 2023-08-24 09:07:42 +0300
  • 0c268a83e8 ggml-alloc: avoid returning silently lshzh-ww 2023-08-24 01:34:57 -0400
  • ee8b2aa75d metal: better memory alloc w/ concurrency dispatch lshzh-ww 2023-08-24 00:55:58 -0400
  • 6e91a1b070 llama : fix grammar sometimes generating null char (#2756) b1046 Evan Jones 2023-08-24 00:07:13 -0400
  • 471e469ae2 pre gguf merge netrunnereve 2023-08-23 23:53:06 -0400
  • d50ccb03a3 manual merge netrunnereve 2023-08-23 23:46:15 -0400
  • 3bf60c5eb3 llama : fix grammar sometimes generating null char Evan Jones 2023-08-23 20:49:47 -0400
  • 47b9f2d36f log_enable/disable, LOG_TEE, basic usage doc staviq 2023-08-24 02:00:16 +0200
  • 463e117820 Simplify vector building logic ochafik 2023-08-23 22:21:07 +0100
  • 44d5462b5c readme : fix link Georgi Gerganov 2023-08-23 23:44:19 +0300
  • c7868b0753 minor : fix trailing whitespace Georgi Gerganov 2023-08-23 23:43:00 +0300
  • 79da24b58c readme : update hot topics Georgi Gerganov 2023-08-23 23:41:16 +0300
  • 5132130af7 llama2.c: direct gguf output (WIP) ochafik 2023-08-23 21:08:00 +0100
  • d8beb85c74 Merge branch 'master' into fix-whitespace Georgi Gerganov 2023-08-23 23:09:35 +0300
  • cf658adc83 llm : add Falcon support (#2717) master-cf658ad Georgi Gerganov 2023-08-23 23:08:04 +0300
  • fae8faa135 perplexity : add log for start of tokenization Georgi Gerganov 2023-08-23 22:56:50 +0300
  • 977629a34e Merge branch 'master' into fix-eos fix-eos Georgi Gerganov 2023-08-23 22:40:19 +0300
  • 680ab3dcb1 Merge 6803aac321 into a192860cfe akawrykow 2023-08-23 21:37:57 +0200
  • a192860cfe minor : fix trailing whitespace master-a192860 Georgi Gerganov 2023-08-23 22:37:39 +0300
  • 95385241a9 examples : restore the functionality to import llama2.c models (#2685) master-9538524 Olivier Chafik 2023-08-23 20:33:05 +0100
  • 5afce7939c llama : escape all U+2581 in a string Georgi Gerganov 2023-08-23 22:06:16 +0300
  • 8d0dc476c9 llama2.c: convert special-cased "<0xXX>" single byte tokens from tokenizer.bin ochafik 2023-08-23 19:56:16 +0100
  • 630d8b408a llama : default special tokens based on vocab type Georgi Gerganov 2023-08-23 21:39:09 +0300
  • 8c6d3939c7 cuda : add TODOs for RoPE NeoX implementation Georgi Gerganov 2023-08-23 21:32:12 +0300
  • 71d05b9ae4 remove atomics and add dynamic log target staviq 2023-08-23 20:27:20 +0200
  • 7df517c797 update finetune README xaedes 2023-08-23 20:08:48 +0200
  • 1a5f0a30e0 add command line option --rank-wo N for rank of wo tensor xaedes 2023-08-23 20:00:48 +0200
  • f8ee54bd2c llama : revert BPE special-case in llama_byte_to_token() Georgi Gerganov 2023-08-23 20:39:24 +0300
  • 77a3092c83 update checkpoint train stats before saving via "--save-every" xaedes 2023-08-23 19:34:45 +0200
  • 596e1094fb common : remove obsolete BPE API + disable test-tokenizer-1 Georgi Gerganov 2023-08-23 20:31:03 +0300
  • 2424e1d08e llama : remove obsolete comment Georgi Gerganov 2023-08-23 20:16:40 +0300
  • 3bfb720642 llama : advanced BPE tokenizer based on ggllm.cpp implementation Georgi Gerganov 2023-08-23 20:11:45 +0300
  • b5184d7274 Make api_like_OAI.py work with Microsoft Guidance Ryder Wishart 2023-08-23 10:10:57 -0700
  • c3f8a6e49f llama : prep new tokenizer support Georgi Gerganov 2023-08-23 19:08:44 +0300
  • 335acd2ffd fix convert-lora-to-ggml.py (#2738) slaren 2023-08-23 16:46:54 +0200
  • 5290c38e6e main : insert bos if no tokens (#2727) master-5290c38 klosax 2023-08-23 16:46:03 +0200
  • cc34dbda96 gitignore : fix for windows (#2729) akawrykow 2023-08-23 07:31:34 -0700
  • 7c2227a197 chmod : make scripts executable (#2675) Cebtenzzre 2023-08-23 10:29:09 -0400
  • f19dca04ea devops : RPM Specs (#2723) JohnnyB 2023-08-23 15:28:22 +0100
  • 8263fd7bdb Update llama_v3.cpp (#393) askmyteapot 2023-08-24 00:15:48 +1000
  • 004016e6d8 Update examples/main/main.cpp Georgi Gerganov 2023-08-23 17:12:26 +0300
  • c3c5aacef6 Update examples/main/main.cpp Georgi Gerganov 2023-08-23 17:11:56 +0300
  • 6938c5f474 Merge branch 'master' into falcon Georgi Gerganov 2023-08-23 17:08:14 +0300
  • 356a166b19 reverted log auto endline to better mimic printf staviq 2023-08-23 16:06:46 +0200
  • d5156d3345 added basic log file handler staviq 2023-08-23 16:03:30 +0200
  • 727af3ea16 add *.log to .gitignore staviq 2023-08-23 16:02:43 +0200
  • 176ea716b3 llama : better model naming and size reporting Georgi Gerganov 2023-08-23 15:53:41 +0300
  • e7299656bd falcon : add CUDA offloading (#2739) slaren 2023-08-23 14:51:30 +0200
  • 9a13fc4efd Add the short hash back into the tag Danny Daemonic 2023-08-23 05:50:58 -0700
  • 943a248e40 Explain to the user that GGML isn't supported anymore Ignacio DM 2023-08-23 09:18:05 -0300
  • 95434613b8 falcon : add CUDA offloading slaren 2023-08-23 14:31:39 +0200
  • 854ae5d030 metal : temporary workaround for the concurrency optimization bug Georgi Gerganov 2023-08-23 15:25:31 +0300
  • 0a85ae7397 metal : fix GELU kernel numerical stability by using precise::tanh Georgi Gerganov 2023-08-23 15:04:53 +0300
  • 7935986faa fix convert-lora-to-ggml.py slaren 2023-08-23 13:44:27 +0200
  • b693000c2e llama.cpp : fix linefeed token klosax 2023-08-23 13:22:41 +0200
  • bfdc596d58 gguf reader in file format detection Concedo 2023-08-23 19:19:52 +0800
  • 8f7fb69031 Fixed double "$". Use ">>" more consistently. Danny Daemonic 2023-08-23 03:33:31 -0700
  • 8207214b6a Fix values shown in the quantize tool help (#2735) master-8207214 Kawrakow 2023-08-23 12:57:12 +0300
  • 62959e740e Strided perplexity (#2714) master-62959e7 Kawrakow 2023-08-23 12:56:42 +0300
  • 7f7ddd5002 Fix ggml to gguf conversion on Windows (#2733) IgnacioFDM 2023-08-23 06:31:09 -0300
  • e2d23bed1b falcon : minor changes (still chasing the Metal problem) Georgi Gerganov 2023-08-23 12:25:49 +0300
  • 575c9066a8 Fix ggml to gguf conversion on Windows Ignacio DM 2023-08-23 04:11:44 -0300
  • af170fc2db Merge branch 'master' into concedo_experimental Concedo 2023-08-23 17:08:09 +0800
  • a0dc47a501 metal : print extra compute pipeline info Georgi Gerganov 2023-08-23 11:25:26 +0300
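The Q8_0 commits above (d67777c202, 1202e06c6f, 61c8259a88, 46a0881c7f) add Metal kernels for ggml's Q8_0 format: blocks of 32 signed 8-bit quants sharing one per-block scale. Below is a minimal sketch of the dequantization those kernels perform; a plain float stands in for the format's fp16 scale, and the struct name is illustrative rather than ggml's actual type.

```cpp
#include <cstdint>
#include <cstdio>

// Q8_0: blocks of 32 signed 8-bit quants plus one scale per block.
// Dequantization is simply x[i] = d * qs[i].
constexpr int QK8_0 = 32;

struct block_q8_0_sketch {
    float  d;          // per-block scale (fp16 in the real format)
    int8_t qs[QK8_0];  // quantized values
};

static void dequantize_row_q8_0(const block_q8_0_sketch *blocks, float *y, int n) {
    for (int i = 0; i < n / QK8_0; ++i) {
        for (int j = 0; j < QK8_0; ++j) {
            y[i * QK8_0 + j] = blocks[i].d * blocks[i].qs[j];
        }
    }
}

int main() {
    block_q8_0_sketch b = { 0.05f, {} };
    for (int j = 0; j < QK8_0; ++j) b.qs[j] = (int8_t) (j - 16);
    float out[QK8_0];
    dequantize_row_q8_0(&b, out, QK8_0);
    std::printf("first three: %.2f %.2f %.2f\n", out[0], out[1], out[2]);
    return 0;
}
```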
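Several commits chase numerical accuracy in the GELU path (84e8da665d "use ggml_float for gelu", 797312e758 "use double precision for tanh", 0a85ae7397 "precise::tanh" in Metal). A minimal sketch of the standard tanh-based GELU approximation with the inner tanh evaluated in double, which is the kind of precision bump these commits make; the helper name is hypothetical, not ggml's actual code.

```cpp
#include <cmath>
#include <cstdio>

// Tanh approximation of GELU: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3))).
// Evaluating tanh in double keeps large-|x| inputs from saturating
// inaccurately in float. Illustrative sketch only.
static float gelu_tanh(float xf) {
    const double x = (double) xf;
    const double c = 0.7978845608028654; // sqrt(2/pi)
    const double inner = c * (x + 0.044715 * x * x * x);
    return (float) (0.5 * x * (1.0 + std::tanh(inner)));
}

int main() {
    for (float x : {-4.0f, -1.0f, 0.0f, 1.0f, 4.0f}) {
        std::printf("gelu(%+.1f) = %+.6f\n", x, gelu_tanh(x));
    }
    return 0;
}
```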
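Commit 2cf4f62e12 skips computing logits for all but the last position during batch prompt evaluation; earlier positions only need their KV-cache entries written. A toy sketch of the idea under made-up dimensions, with a plain loop standing in for the real ggml matmul: the output projection is applied to just the final hidden state instead of all n_tokens rows.

```cpp
#include <vector>
#include <cstdio>

// Only the last token's logits are needed for sampling, so project just the
// final row of the hidden states: one row of work instead of n_tokens rows.
// Purely illustrative; not llama.cpp's actual implementation.
static std::vector<float> logits_last_only(
        const std::vector<float> &hidden, // n_tokens x n_embd, row-major
        const std::vector<float> &w_out,  // n_vocab x n_embd, row-major
        int n_tokens, int n_embd, int n_vocab) {
    std::vector<float> logits(n_vocab, 0.0f);
    const float *last = &hidden[(size_t) (n_tokens - 1) * n_embd];
    for (int v = 0; v < n_vocab; ++v)
        for (int e = 0; e < n_embd; ++e)
            logits[v] += w_out[(size_t) v * n_embd + e] * last[e];
    return logits;
}

int main() {
    const int n_tokens = 4, n_embd = 2, n_vocab = 3;
    std::vector<float> h = {1,0, 0,1, 1,1, 2,1}; // toy hidden states
    std::vector<float> w = {1,0, 0,1, 1,1};      // toy output matrix
    auto l = logits_last_only(h, w, n_tokens, n_embd, n_vocab);
    std::printf("%.1f %.1f %.1f\n", l[0], l[1], l[2]);
    return 0;
}
```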
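The U+2581 commits (c3e53b421a, 5afce7939c) deal with ▁ (U+2581, LOWER ONE EIGHTH BLOCK), the character SentencePiece-style tokenizers use to mark spaces. A sketch of the usual space-escaping convention applied before tokenization; escape_whitespace here is a hypothetical helper, not llama.cpp's exact function.

```cpp
#include <string>
#include <cstdio>

// SentencePiece-style tokenizers represent word boundaries with U+2581
// (0xE2 0x96 0x81 in UTF-8). Escaping replaces every space with that marker;
// unescaping reverses the substitution. Illustrative sketch only.
static std::string escape_whitespace(const std::string &s) {
    std::string out;
    for (char c : s) {
        if (c == ' ') out += "\xe2\x96\x81";
        else          out += c;
    }
    return out;
}

int main() {
    std::printf("%s\n", escape_whitespace("hello rope world").c_str());
    return 0;
}
```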
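Commits ac4bb6ba02 and 8c6d3939c7 concern the NeoX RoPE variant (ggml's mode == 2), which rotates dimension pairs split across the two halves of the head, rather than the interleaved (2i, 2i+1) pairs of the original GPT-J-style RoPE. A minimal sketch of that rotation, assuming the common frequency base of 10000; this is an illustration of the math, not the CUDA kernel itself.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// NeoX-style rotary embedding: rotate pairs (i, i + n/2) by angle
// theta_i = pos * base^(-2i/n), where n is the (even) head dimension.
static void rope_neox(std::vector<float> &x, int pos, float base = 10000.0f) {
    const int n = (int) x.size();
    for (int i = 0; i < n / 2; ++i) {
        const float theta = pos * std::pow(base, -2.0f * i / n);
        const float c = std::cos(theta), s = std::sin(theta);
        const float x0 = x[i], x1 = x[i + n / 2];
        x[i]         = x0 * c - x1 * s;
        x[i + n / 2] = x0 * s + x1 * c;
    }
}

int main() {
    std::vector<float> v = {1, 0, 0, 1, 0, 1, 1, 0}; // toy 8-dim head
    rope_neox(v, /*pos=*/3);
    for (float f : v) std::printf("%+.4f ", f);
    std::printf("\n");
    return 0;
}
```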