Commit Graph

  • 62f4926bde fix : fix errors upd writing example M. Yusuf Sarıgöz 2023-07-28 00:04:19 +0300
  • 7c529cede6 convert.py : Update to support 70B HF format model files (#2427) mj-shifu 2023-07-27 22:39:17 +0200
  • 9411250564 refactor : rm unused import and upd todos M. Yusuf Sarıgöz 2023-07-27 23:25:47 +0300
  • bb54d1700e GGUF : Support writing tensors in Python M. Yusuf Sarıgöz 2023-07-27 23:09:53 +0300
  • 464192b9be WIP: Write tensor M. Yusuf Sarıgöz 2023-07-27 22:25:04 +0300
  • 9442c34f49 convert.py : shorten and simplify permute Maximilian Markewitz 2023-07-27 20:59:43 +0200
  • 01d16e1a1e convert.py : fix of type and shorter code Maximilian Markewitz 2023-07-27 20:03:43 +0200
  • e15a67d6b2 convert.py : fix llama 2 70b conversion from Huggingface Maximilian Markewitz 2023-07-27 19:16:58 +0200
  • 966c069b3f llama.cpp : fix embeddings input slaren 2023-07-27 19:03:31 +0200
  • ba0ab56b63 llama.cpp : fix embeddings output slaren 2023-07-27 18:54:06 +0200
  • e592a17a75 ggml : refactor ggml_view_Nd into ggml_view_tensor_offset slaren 2023-07-27 18:40:52 +0200
  • e39e62ba4a replace n_views and n_children in ggml_tensor with a hash table in the allocator slaren 2023-07-27 18:34:21 +0200
  • af7bd42b2a llama.cpp : free allocator when deleting context, cleanup slaren 2023-07-27 18:02:53 +0200
  • 64584d56a7 ggml : don't calculate data pointer of unallocated tensors when creating a view with an offset slaren 2023-07-27 17:46:05 +0200
  • f67179aaf2 add list of ops that support in-place slaren 2023-07-27 16:11:32 +0200
  • 8fa548377a allow using the allocator with opencl slaren 2023-07-27 12:18:03 +0200
  • 8afe392398 fix mpi build slaren 2023-07-27 12:15:49 +0200
  • 598a9ada8f adjust buffer size to account for alignment slaren 2023-07-27 12:14:51 +0200
  • 768ecfcc28 ggml : add graph tensor allocator slaren 2023-07-26 17:13:58 +0200
  • ca4650afdb common.cpp : Change default param klosax 2023-07-27 16:48:53 +0200
  • 90b2ce3549 common.h : change default param value klosax 2023-07-27 16:46:18 +0200
  • 01bdda2574 Update index.html JackJollimore 2023-07-27 11:35:17 -0300
  • d2bb3ac10b convert.py : remove GGML vocab + other obsolete stuff Georgi Gerganov 2023-07-27 16:36:35 +0300
  • 68f53485e4 convert.py : start a new simplified implementation by removing old stuff Georgi Gerganov 2023-07-27 15:56:53 +0300
  • 158be8f7f4 gguf.py : some code style changes Georgi Gerganov 2023-07-27 15:37:06 +0300
  • d2b6ca13ad gguf : add array support Georgi Gerganov 2023-07-27 14:53:07 +0300
  • a55b102e21 Update index.html.hpp by running ./deps.sh Ebrahim Byagowi 2023-07-27 13:36:37 +0330
  • 6500d95af4 Merge 5f04a5d877 into 1a941869cb Howard Su 2023-07-27 09:11:44 +0000
  • efb5dac337 supporting more diverse tokenizers eric8607242 2023-07-27 16:59:42 +0800
  • d89533dff6 gguf : expose the gguf_type enum through the API for now Georgi Gerganov 2023-07-27 11:10:34 +0300
  • 1a941869cb metal : disable graph concurrency optimization due to bug (#2413) master-1a94186 Georgi Gerganov 2023-07-27 11:00:54 +0300
  • a6c25ebf3e supporting more diverse tokenizers eric.huang 2023-07-27 15:37:14 +0800
  • af1c9966c8 gguf : start write tensor info gguf-python M. Yusuf Sarıgöz 2023-07-27 10:32:31 +0300
  • c85d3178b3 refactor : reduce code duplication and better API (#2415) M. Yusuf Sarıgöz 2023-07-27 10:29:29 +0300
  • 1038d1d2bc metal : fix out-of-bounds access + style changes Georgi Gerganov 2023-07-27 10:10:51 +0300
  • 8332d26123 refactor: reduce code duplication and better API M. Yusuf Sarıgöz 2023-07-27 09:48:08 +0300
  • 2855ffa7f4 server : Support dark mode Ebrahim Byagowi 2023-07-27 09:44:47 +0330
  • 5ef33fbd5c complete JSON grammar Evan Jones 2023-07-26 21:15:26 -0400
  • ffb8c87caa Merge remote-tracking branch 'upstream/master' into json-schema Evan Jones 2023-07-26 20:46:52 -0400
  • a0f564ff4a Merge remote-tracking branch 'origin/master' into prompt-array Xiao-Yong Jin 2023-07-26 17:43:08 -0500
  • bb3770b3e6 server: tokenize endpoint no longer adds BOS Xiao-Yong Jin 2023-07-26 17:42:20 -0500
  • b5472ea0ad ggml : fix assert in ggml_set_unary_op (#2410) master-b5472ea slaren 2023-07-26 23:57:23 +0200
  • f9c3a3fd60 ggml : fix assert in ggml_set_unary_op slaren 2023-07-26 22:04:29 +0200
  • d8491fc7e3 gguf : add comments Georgi Gerganov 2023-07-26 22:56:26 +0300
  • 5628ec7163 gguf : read / write sample models Georgi Gerganov 2023-07-26 20:04:22 +0300
  • e9c17039db Create example bash script for LlaMa 2 Chat lionelchg 2023-07-26 21:31:30 +0200
  • 6df1f5940f make : build with -Wmissing-prototypes (#2394) master-6df1f59 Cebtenzzre 2023-07-26 14:00:04 -0400
  • cddfec9ff2 Create bash script for LlaMa 2 Chat models lionelchg 2023-07-26 19:16:07 +0200
  • 01814b6014 fix : workaround for missing _mm256_setr_m128i in GCC < 8 in newly added k_quants.c Lee 2023-07-27 00:48:51 +0800
  • e46870f5af gguf : gguf.c is now part of ggml.c Georgi Gerganov 2023-07-26 18:55:32 +0300
  • d313c0fa33 gguf : simplify gguf_get_val Georgi Gerganov 2023-07-26 18:53:57 +0300
  • cb871fa022 gguf : do not support passing existing ggml_context to gguf_init Georgi Gerganov 2023-07-26 18:48:52 +0300
  • 860c9c63ce gguf : add gguf_get_tensor_name() Georgi Gerganov 2023-07-26 16:36:03 +0300
  • 78b226a959 gguf : initial model loading - not tested Georgi Gerganov 2023-07-26 16:32:05 +0300
  • d91b985d2d gguf : read tensor info Georgi Gerganov 2023-07-26 14:58:35 +0300
  • 8d6acfec12 gguf : read header + meta data Georgi Gerganov 2023-07-26 14:33:53 +0300
  • 6873148771 gguf : first API pass Georgi Gerganov 2023-07-26 13:24:20 +0300
  • 7e82d25f40 ci : disable CI temporarily to not waste energy Georgi Gerganov 2023-07-26 11:26:14 +0300
  • bae6b125f6 wip : implement GGUF (#2397) M. Yusuf Sarıgöz 2023-07-26 11:17:05 +0300
  • 4d698495ea gguf : init Georgi Gerganov 2023-07-26 11:16:07 +0300
  • 5488fb789e ggml : allocate graphs in a context (#2392) master-5488fb7 slaren 2023-07-26 15:56:53 +0200
  • 0509a68016 Adding newline to eof Stephen Nichols 2023-07-26 07:14:44 -0500
  • 1b4fd4e0d9 cleanup slaren 2023-07-26 13:10:41 +0200
  • e5055f0971 fix: remove the unnecessary last \n nhamanasu 2023-07-26 19:49:51 +0900
  • 7949dcaaf7 add GGML_PAD slaren 2023-07-26 12:49:18 +0200
  • 5c74eb0b2e add: server chat mode with llama2 nhamanasu 2023-07-26 19:19:36 +0900
  • 156d99abde cleanup slaren 2023-07-26 11:48:20 +0200
  • 5c19dd3eef Update ggml.c slaren 2023-07-26 11:34:17 +0200
  • fb9a06773c WIP: python class to write GGUF, incomplete C API for reading M. Yusuf Sarıgöz 2023-07-26 10:57:03 +0300
  • bee2a3d981 Add docs. Make fschat and flask-cors optional Elsa 2023-07-25 18:16:31 +0800
  • ea5a7fbc95 Use conversation template from fastchat for api proxy. Fix eventsource format Elsa 2023-07-25 18:07:27 +0800
  • de41d5ecd8 Fix static declarations goerch 2023-07-26 08:30:25 +0200
  • 94e0a06daf updated lite, up ver (+1 squashed commits) Concedo 2023-07-26 10:35:53 +0800
  • b184380aae Revert "a better default rms_norm_eps" Concedo 2023-07-26 10:23:45 +0800
  • f53d2aabb4 Merge branch 'master' into concedo_experimental Concedo 2023-07-26 10:19:59 +0800
  • 97110251b9 fix mpi slaren 2023-07-26 01:43:53 +0200
  • 78ef528958 make : build with -Wmissing-prototypes Cebtenzzre 2023-07-25 18:42:18 -0400
  • a9019963a1 WIP: super not working attempt atm. will update as I learn more ggml :D Aniket 2023-07-25 16:37:38 -0400
  • 78f8e4d604 add the new example directory in gitignore Aniket 2023-07-25 16:36:39 -0400
  • 77d662faa5 llama.cpp : allocate graph in the context slaren 2023-07-25 20:36:32 +0200
  • 567b5e24ed allocate work buffer as a ggml_object in ggml_graph_compute_with_ctx slaren 2023-07-25 20:35:59 +0200
  • 59e808b49b ggml : graph allocation in contexts slaren 2023-07-25 20:29:02 +0200
  • 3811c0a505 Reverting assert edits. Stephen Nichols 2023-07-25 12:14:21 -0500
  • 1b2ec1aa72 Move to graph function similar to CUDA implementation 0cc4m 2023-07-25 19:01:28 +0200
  • 3e3f38af48 Fixing race condition in server.cpp and partial stream handling in completion.js Stephen Nichols 2023-07-25 11:57:29 -0500
  • b4a5461ff8 Resolve merge conflict with grammar stuff. goerch 2023-07-25 18:14:38 +0200
  • 3bdf106e06 Merge branch 'master' into fix-2023 goerch 2023-07-25 17:59:13 +0200
  • 5e7a26628b Merge branch 'ggerganov:master' into hellaswag_scores klosax 2023-07-25 17:58:04 +0200
  • fae04ddd97 perplexity.cpp : clean up klosax 2023-07-25 17:57:15 +0200
  • e68580f993 Remove llama.cpp.h goerch 2023-07-25 17:49:24 +0200
  • ae4d116bdf perplexity.cpp : add hellaswag scores / remove perplexity-lines klosax 2023-07-25 17:43:34 +0200
  • a40f608249 common.cpp : add hellaswag / remove perplexity-lines klosax 2023-07-25 17:37:00 +0200
  • eb542d3932 Add LLAMA_DEFAULT_RMS_EPS so we can change the default (#2384) master-eb542d3 Kawrakow 2023-07-25 18:35:53 +0300
  • 522a29c426 common.h : add hellaswag / remove perplexity-lines klosax 2023-07-25 17:31:43 +0200
  • 6a054b80b0 Merge branch 'master' into concedo_experimental Concedo 2023-07-25 22:55:55 +0800
  • 0c26799e77 a better default rms_norm_eps Concedo 2023-07-25 22:51:01 +0800
  • 07aaa0f63f ggml : fix ggml_flash_attn to use op_params (#2387) master-07aaa0f slaren 2023-07-25 16:20:12 +0200
  • e25e15c9c5 fix slaren 2023-07-25 15:50:57 +0200
  • 8a927cf487 ggml : fix ggml_flash_attn to use op_params slaren 2023-07-25 15:43:46 +0200
  • fce48caf9a convert.py : support bpe tokenizer (#2228) ldwang 2023-07-25 21:22:09 +0800
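Many of the commits above build up GGUF reading and writing ("gguf : read header + meta data", "GGUF : Support writing tensors in Python"). As a hedged illustration only — this is not the actual gguf.py or gguf.c code, and all names below are hypothetical — the general pattern is a magic number, a version, and length-prefixed key-value metadata serialized with little-endian struct packing:

```python
# Minimal sketch of a GGUF-like binary header: magic, version, and
# string-keyed uint32 metadata. The real format supports many value
# types (strings, floats, arrays) and tensor info; this only shows
# the length-prefixed key-value serialization pattern.
import io
import struct

MAGIC = b"GGUF"

def write_header(buf, version, kv_pairs):
    buf.write(MAGIC)
    buf.write(struct.pack("<I", version))
    buf.write(struct.pack("<I", len(kv_pairs)))  # number of kv pairs
    for key, value in kv_pairs.items():
        data = key.encode("utf-8")
        buf.write(struct.pack("<I", len(data)))  # length-prefixed key
        buf.write(data)
        buf.write(struct.pack("<I", value))      # uint32 value

def read_header(buf):
    assert buf.read(4) == MAGIC, "bad magic"
    version, n_kv = struct.unpack("<II", buf.read(8))
    kv = {}
    for _ in range(n_kv):
        (klen,) = struct.unpack("<I", buf.read(4))
        key = buf.read(klen).decode("utf-8")
        (kv[key],) = struct.unpack("<I", buf.read(4))
    return version, kv

# Round-trip: write a header to memory, then read it back.
buf = io.BytesIO()
write_header(buf, 1, {"n_vocab": 32000, "n_embd": 4096})
buf.seek(0)
version, kv = read_header(buf)
```

Length-prefixing every key (rather than null-terminating) is what lets a reader skip unknown metadata without parsing it, which is the usual motivation for this style of self-describing model file.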