Commit Graph

  • bbfd39382e Zig @cImport("llama.h") requires enum keyword in function signatures Kamil Tomšík 2023-08-14 13:20:31 +0200
  • 797088a7cd minor : indentation + assert Georgi Gerganov 2023-08-14 14:10:21 +0300
  • f4a0e0ec5a llama : sync gguf-llama.cpp with latest llama.cpp Georgi Gerganov 2023-08-14 13:44:37 +0300
  • 62490f1380 gguf : use UNIX line ending Georgi Gerganov 2023-08-14 13:04:35 +0300
  • 0c19ae70d5 simple : minor style changes Georgi Gerganov 2023-08-14 12:56:48 +0300
  • 5c5a95ba2d gguf.py : don't add empty strings klosax 2023-08-14 11:22:06 +0200
  • a7d226f871 convert-llama-h5-to-gguf.py : fixes klosax 2023-08-14 11:14:24 +0200
  • c2c1690568 Merge branch 'master' into server-probs jhen 2023-08-14 17:13:24 +0800
  • e9be24f9ad Fix fp32 fallback if device doesn't support fp16, add force disable env var GGML_VULKAN_DISABLE_F16 0cc4m 2023-08-14 11:07:55 +0200
  • d753dfbcc8 gptneox-main.cpp : tensor name map changes klosax 2023-08-14 10:59:18 +0200
  • 806a15749d Delete gguf_tensor_map.py klosax 2023-08-14 10:57:19 +0200
  • 51939d7d1b Create gguf_namemap.py : tensor name map changes klosax 2023-08-14 10:56:59 +0200
  • 5d22a9db13 convert-gptneox-h5-to-gguf.py : tensor name map changes klosax 2023-08-14 10:55:44 +0200
  • 236194fc3d add more comments Evan Jones 2023-08-14 04:41:50 -0400
  • 1cd06fa25e CUDA: launch_bounds, small q4_K, q5_K mmq refactor (#2596) master-1cd06fa Johannes Gäßler 2023-08-14 10:41:22 +0200
  • 2feb8934eb server : fix default grammar by using empty string in the UI (#2604) master-2feb893 Jhen-Jie Hong 2023-08-14 16:20:17 +0800
  • 7f828e6b10 Add llama_context_default_params_by_ref to be available to get it from java JNA ostix360 2023-08-14 10:07:49 +0200
  • e950518776 CUDA: launch_bounds, small q4_K, q5_K mmq refactor JohannesGaessler 2023-08-12 17:13:20 +0200
  • 01d22a4a10 Merge upstream changes, fix conflict 0cc4m 2023-08-14 09:47:43 +0200
  • 592ebb044d Transfer remaining shaders to header and compile on runtime 0cc4m 2023-08-14 09:39:58 +0200
  • d2dd8eb1d1 server : default grammar to empty string in the UI jhen 2023-08-14 15:28:17 +0800
  • 70e2f7ca56 Merge 'origin/master' into hipblas Henri Vasserman 2023-08-14 10:27:18 +0300
  • dbdb2c1353 Merge branch 'master' of github.com:ggerganov/llama.cpp Laura 2023-08-14 09:21:10 +0200
  • 5517d6e692 server : implement json-schema-to-grammar.mjs & add grammar param in the UI (#2588) master-5517d6e Jhen-Jie Hong 2023-08-14 15:16:54 +0800
  • 56a1f32072 Merge branch 'master' into gguf Georgi Gerganov 2023-08-14 10:14:05 +0300
  • 196b50fee7 gguf : add todos and comments M. Yusuf Sarıgöz 2023-08-14 08:50:47 +0300
  • f31b539714 Enhance Windows 7 and below compatibility. (#2592) master-f31b539 vxiiduu 2023-08-14 13:59:16 +1000
  • 4ae761144d server : optimize regex & iteration jhen 2023-08-14 09:20:06 +0800
  • c57302162a server : fix sort of prop pairs jhen 2023-08-14 07:40:41 +0800
  • f41c6254a8 server : generate .hpp jhen 2023-08-14 06:24:12 +0800
  • ee77efea2a test : add simple grammar parsing tests (#2594) master-ee77efe drbh 2023-08-13 10:00:48 -0400
  • 24f48833ab fix conflicts M. Yusuf Sarıgöz 2023-08-13 16:55:42 +0300
  • dc29f21481 adds cassert header drbh 2023-08-13 09:25:53 -0400
  • 6beebf3fd9 gptneox-main.cpp : add file_type key klosax 2023-08-13 14:11:01 +0200
  • 2827b840e4 convert-gptneox-h5-to-gguf.py : add file_type key klosax 2023-08-13 13:54:10 +0200
  • bf2dad3100 convert : rm quantization version M. Yusuf Sarıgöz 2023-08-13 14:38:53 +0300
  • 1d60468eee fix conflicts M. Yusuf Sarıgöz 2023-08-13 13:35:40 +0300
  • 91d4bfd536 convert : write more metadata for LLaMA M. Yusuf Sarıgöz 2023-08-13 13:29:46 +0300
  • 17800cd80f convert-llama-h5-to-gguf.py : load model in parts to save memory klosax 2023-08-13 12:20:02 +0200
  • e3d1f07eb1 convert-gptneox-h5-to-gguf.py : load model in parts to save memory klosax 2023-08-13 12:18:34 +0200
  • 6d094cd89c server : remove trailing whitespaces jhen 2023-08-13 17:47:54 +0800
  • a47ca7ae7a Add runtime shader compilation, start transferring shaders to this approach 0cc4m 2023-08-13 11:01:27 +0200
  • 9bf5a7efcb Update gguf_tensor_map.py klosax 2023-08-13 01:27:38 +0200
  • bffd3cde10 server : remove array check of completion_probabilities in messages jhen 2023-08-13 07:07:40 +0800
  • f64d44a9b9 CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590) master-f64d44a Johannes Gäßler 2023-08-13 00:24:45 +0200
  • c7bd8c147c gptneox-main.cpp : n_layer --> n_block klosax 2023-08-13 00:03:32 +0200
  • e91a2224e4 convert-llama-h5-to-gguf.py : n_layer --> n_block klosax 2023-08-13 00:02:44 +0200
  • 489616e126 convert-gptneox-h5-to-gguf.py : n_layer --> n_block klosax 2023-08-13 00:02:04 +0200
  • d2ce9cfe8d gguf.py : n_layer --> n_block klosax 2023-08-13 00:01:20 +0200
  • 8b5f0c5067 constants.py : n_layer --> n_block klosax 2023-08-13 00:00:32 +0200
  • 43e726d9c0 adds simple grammar parsing tests drbh 2023-08-12 17:53:57 -0400
  • 5e58ffa1ed gptneox-main.cpp : n_layer --> n_block klosax 2023-08-12 23:50:58 +0200
  • e606ffeaee convert-llama-h5-to-gguf.py : simplify nbytes klosax 2023-08-12 22:30:35 +0200
  • f8218477b3 convert-gptneox-h5-to-gguf.py : simplify nbytes klosax 2023-08-12 22:29:35 +0200
  • 4cef57c81a convert-llama-h5-to-gguf.py : no need to convert tensors twice klosax 2023-08-12 21:50:24 +0200
  • 8f09157ec9 convert-gptneox-h5-to-gguf.py : no need to convert tensors twice klosax 2023-08-12 21:48:58 +0200
  • 5d81a715d4 gguf.py : no need to convert tensors twice klosax 2023-08-12 21:45:45 +0200
  • afacdfe83a install ggml-meta.metal if LLAMA_METAL Kolen Cheung 2023-07-29 21:09:10 +0100
  • 60d540831b gguf : proper closing of file M. Yusuf Sarıgöz 2023-08-12 21:42:31 +0300
  • f857820e5d Fix MSVC compiler error vxiiduu 2023-08-13 01:09:26 +1000
  • 8847e95725 clean away unnecessary preprocessor conditional vxiiduu 2023-08-13 00:32:52 +1000
  • b684583f0c Update llama-util.h vxiiduu 2023-08-13 00:23:36 +1000
  • 202eab04d3 gguf : quantization is working M. Yusuf Sarıgöz 2023-08-12 16:39:05 +0300
  • 1fc3d30b71 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 16:09:47 +0300
  • fa7c39540c gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 15:55:58 +0300
  • ac27ac75ac Enhance Windows 7 compatibility. vxiiduu 2023-08-12 22:19:16 +1000
  • 4754bc22b6 Add --cfg-negative-prompt-file option for examples KerfuffleV2 2023-08-12 05:39:02 -0600
  • b2571af255 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 14:28:17 +0300
  • 546aae99a4 CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time JohannesGaessler 2023-08-11 19:59:33 +0200
  • c4f02b4f74 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 12:01:17 +0300
  • 0e1a3c7e7d gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 11:32:34 +0300
  • 1132941cb3 Fix descriptor set pre-allocation assert 0cc4m 2023-08-12 10:22:58 +0200
  • 7ac00def7b Remove unnecessary cblas link 0cc4m 2023-08-12 10:22:38 +0200
  • 9483288e03 Merge branch 'master' into concedo_experimental Concedo 2023-08-12 16:04:11 +0800
  • 641561058b gfx1100 support Henri Vasserman 2023-08-12 10:51:46 +0300
  • 4fa017a1f9 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 10:40:56 +0300
  • 8ff0398be7 server : generate .hpp jhen 2023-08-12 13:15:41 +0800
  • 186c496fdf Merge branch 'gguf' of https://github.com/ggerganov/llama.cpp into gguf M. Yusuf Sarıgöz 2023-08-12 07:25:10 +0300
  • 2f52008b20 gguf : rm references to old file magics M. Yusuf Sarıgöz 2023-08-12 07:24:46 +0300
  • 3409735cff server : skip byte pair in display probabilities jhen 2023-08-12 12:06:40 +0800
  • 55a86adc24 server : implement grammar param in the UI jhen 2023-08-12 10:41:56 +0800
  • c58ff992dc server : add grammar support in chat.mjs jhen 2023-08-12 10:15:31 +0800
  • 5d79fbcc4d server : implement json-schema-to-grammar.mjs following the python impl jhen 2023-08-12 10:02:30 +0800
  • a2b16d6172 Allow for metal development in nix package William Behrens 2023-08-11 20:45:33 -0500
  • b19edd54d5 Adding support for llama2.c models (#2559) master-b19edd5 byte-6174 2023-08-11 19:17:25 -0400
  • 53dc399472 server: fixed wrong variable name in timing json (#2579) master-53dc399 Equim 2023-08-12 06:35:14 +0800
  • e76c59d524 Update gptneox-main.cpp klosax 2023-08-11 23:09:49 +0200
  • 2a5ac7af44 Update gguf_tensor_map.py klosax 2023-08-11 23:08:48 +0200
  • e732423280 gguf : get rid of n_mult, read n_ff from file M. Yusuf Sarıgöz 2023-08-11 23:50:38 +0300
  • fc60a27642 ci: add linux binaries to release build ci_cublas_linux-fc60a27 Green Sky 2023-05-05 00:01:30 +0200
  • f44bbd3d88 gguf : rm redundant method M. Yusuf Sarıgöz 2023-08-11 21:00:51 +0300
  • 7009cf581c gguf : shorter name for member variable M. Yusuf Sarıgöz 2023-08-11 20:43:02 +0300
  • 61919c1a8f gguf : rm references to old file formats M. Yusuf Sarıgöz 2023-08-11 20:36:11 +0300
  • d09fd10713 gguf : write metadata in gguf_file_saver M. Yusuf Sarıgöz 2023-08-11 20:07:43 +0300
  • 781b9ec3f5 gguf : write metadata in gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 18:01:26 +0300
  • 28abfc90fa gguf : write metadata in gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 13:27:58 +0300
  • e3a4960953 gguf : add gguf_get_kv_type M. Yusuf Sarıgöz 2023-08-11 13:03:23 +0300
  • eb8ca6996f gguf : add gguf_get_kv_type M. Yusuf Sarıgöz 2023-08-11 12:24:08 +0300
  • b2440f1943 gguf : start implementing gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 11:29:50 +0300
  • a356b0e228 gguf : start implementing gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 10:50:02 +0300