Commit Graph

  • 546524de6c llama: the method for obtaining information about n_head is included in the public header Michael Podvitskiy 2024-09-16 19:16:49 +0200
  • 544b26640d llama: updated error output for llama_decode_internal and llama_encode_internal Michael Podvitskiy 2024-09-16 18:35:57 +0200
  • a5e87bf438 llama: fixed n_vocab for no_vocab models Michael Podvitskiy 2024-09-16 18:30:28 +0200
  • ed0f2c4ab1 Merge branch 'master' into compilade/convert-separate-extra-tensors Francis Couture-Harpin 2024-09-16 12:01:12 -0400
  • 435b5f9176 Merge cc1c017191 into 23e0d70bac Georgi Gerganov 2024-09-16 16:51:17 +0100
  • 5d054a42f9 fix(llama.cpp): Use separate switch clause for granite in llm_load_hparams Gabe Goodhart 2024-09-16 09:15:15 -0600
  • 65c5bb91ab fix(convert_hf_to_gguf/gguf-py): _multiplier -> _scale Gabe Goodhart 2024-09-16 08:56:56 -0600
  • 0bdf04e7b5 fix(llama.cpp): Switch Granite param names to use _scale for consistency Gabe Goodhart 2024-09-16 08:55:58 -0600
  • 23e0d70bac ggml : move common CPU backend impl to new header (#9509) b3772 slaren 2024-09-16 16:22:07 +0200
  • 13e6d732a0 py : add XLMRobertaForSequenceClassification [no ci] Georgi Gerganov 2024-09-16 16:59:17 +0300
  • 80863806a3 fix(convert_hf_to_gguf): Use LlamaModel as base for GraniteModel Gabe Goodhart 2024-09-10 09:36:44 -0600
  • e73d795eff fix(llama.cpp): Determine granite language 3b instruct by vocab size Gabe Goodhart 2024-09-09 09:03:09 -0600
  • ec13f29b73 feat(llama.cpp): First pass at full port of granite deviations from llama Gabe Goodhart 2024-09-05 16:43:01 -0600
  • 383065ade6 feat(llama.cpp): Add config parsing for Granite multiplier params Gabe Goodhart 2024-09-05 12:06:34 -0600
  • 406833d779 feat(convert_hf_to_gguf): Add registration and param setup for Granite Gabe Goodhart 2024-09-04 12:16:56 -0600
  • 5ebc5ef572 feat(gguf-py): Add Granite model and params to gguf-py Gabe Goodhart 2024-09-04 12:16:21 -0600
  • 57064fbaee ggml : move common CPU backend impl to new header slaren 2024-09-16 15:17:55 +0200
  • 736e0e6a28 llama.cpp: Add a missing header for cpp23 Yuri Khrustalev 2024-09-16 08:47:49 -0400
  • b38dccc9d1 Merge 6db4f52d1c into acb2c32c33 Brian 2024-09-16 07:38:42 -0500
  • bcb163bcb5 Merge 928aa66a92 into acb2c32c33 Brian 2024-09-16 07:37:58 -0500
  • dbd6780445 Merge 7e492b3e0e into acb2c32c33 Marko Tasic 2024-09-16 14:31:07 +0200
  • a8ddaac942 Merge 81fca39112 into acb2c32c33 Jiří Podivín 2024-09-16 14:14:20 +0200
  • acb2c32c33 llama : rename n_embed to n_embd in rwkv6_time_mix (#9504) b3771 Daniel Bevenius 2024-09-16 13:07:13 +0200
  • a6a3a5c531 ggml : link MATH_LIBRARY not by its full path (#9339) b3770 Michael Podvitskiy 2024-09-16 13:06:50 +0200
  • 603a3f8fb7 ggml: link MATH_LIBRARY not by its full path Michael Podvitskiy 2024-09-06 22:01:42 +0200
  • d54c21df7e convert : identify missing model files (#9397) compilade 2024-09-16 03:30:22 -0400
  • 19514d632e cmake : do not hide GGML options + rename option (#9465) Georgi Gerganov 2024-09-16 10:27:50 +0300
  • 5c3d0f1824 ggml : IQ4_NL sgemm + Q4_0 AVX optimization (#9422) b3767 Eve 2024-09-16 06:48:24 +0000
  • 0aadac10c7 llama : support OLMoE (#9462) b3766 Shane A 2024-09-15 23:47:37 -0700
  • 0bfa0dfa2e llama : rename n_embed to n_embd in rwkv6_time_mix Daniel Bevenius 2024-09-16 08:45:33 +0200
  • 95ca85168b llama : support MiniCPM3 (#9322) b3765 CarryFun 2024-09-16 14:45:20 +0800
  • 441b72b91f main : option to disable context shift (#9484) b3764 Vinesh Janarthanan 2024-09-16 01:20:01 -0500
  • a67345e9ab Merge 1440d445db into c4965a64f7 MasterYi1024 2024-09-16 08:14:06 +0200
  • cc1c017191 naming : normalize the name of callback-related identifiers gg/cb-naming Georgi Gerganov 2024-09-16 09:11:42 +0300
  • c4965a64f7 metal : handle zero-sized allocs (#9466) b3763 Georgi Gerganov 2024-09-16 09:05:56 +0300
  • f80e679696 build : rename flag GGML_CUDA_USE_GRAPHS -> GGML_CUDA_GRAPHS Georgi Gerganov 2024-09-16 09:00:43 +0300
  • 2ac8a91fbe cmake : do not hide GGML options Georgi Gerganov 2024-09-13 10:08:55 +0300
  • 169e8a3875 white space VJHack 2024-09-15 21:28:16 -0500
  • 2736688af4 removed server changes VJHack 2024-09-15 21:26:46 -0500
  • 90a2fff0e7 flake.lock: Update (#9488) Georgi Gerganov 2024-09-16 05:14:23 +0300
  • f5a23928c7 added server example to --no-context-shift args VJHack 2024-09-15 20:57:57 -0500
  • 2d887f0975 Merge branch 'server-disable-context-shift' of https://github.com/VJHack/llama.cpp into server-disable-context-shift VJHack 2024-09-15 20:35:38 -0500
  • c73756ab24 resolve merge conflicts VJHack 2024-09-15 20:35:34 -0500
  • 63f0fa572d Update common/arg.cpp Vinesh Janarthanan 2024-09-15 20:35:01 -0500
  • 6262d13e0b common : reimplement logging (#9418) b3761 Georgi Gerganov 2024-09-15 20:46:12 +0300
  • e6deac31f7 gguf-split : add basic checks (#9499) b3760 slaren 2024-09-15 19:02:27 +0200
  • 6988da94a2 cmake : correct order of sycl flags (#9497) b3759 Michael Podvitskiy 2024-09-15 18:55:52 +0200
  • 721e2b1d8b Merge 82755ed08a into 3c7989fd29 Ma Mingfei 2024-09-15 18:45:03 +0200
  • e410850051 Merge 066996d2eb into 3c7989fd29 Ifeanyi 2024-09-15 10:42:55 -0600
  • 78f3caa88e gguf-split : error when too many arguments are passed slaren 2024-09-15 18:27:55 +0200
  • d3922ac9e8 gguf-split : do not overwrite existing files when merging slaren 2024-09-15 18:24:43 +0200
  • 73ef3f769c Update llama-server-intel.Dockerfile sycl-cmake-append Meng, Hengyu 2024-09-15 23:21:46 +0800
  • 4e6035af97 sycl flag should come before the other flags Michael Podvitskiy 2024-09-15 15:47:47 +0200
  • 3956cf92a9 Update llama-cli-intel.Dockerfile Meng, Hengyu 2024-09-15 23:21:21 +0800
  • af95b1424f [SYCL] fix cmake broken Meng, Hengyu 2024-09-15 22:57:56 +0800
  • cf77a846c6 allow disabling context shift in the server VJHack 2024-09-15 09:12:24 -0500
  • 252f3a88ac added null check for llava decode l3utterfly 2024-09-15 21:45:48 +0900
  • db4939040f server: add repeat penalty sigmoid ZXED 2024-09-15 15:00:52 +0300
  • f6ae3bb9f2 Merge 2d79a7077c into 3c7989fd29 compilade 2024-09-15 10:28:36 +0200
  • 3c7989fd29 py : add "LLaMAForCausalLM" conversion support (#9485) Csaba Kecskemeti 2024-09-15 00:48:25 -0700
  • d6b37c881f readme : update tools list (#9475) OSecret 2024-09-15 10:36:53 +0300
  • 7596487beb cmake : try to fix sycl+intel build (#9487) b3756 Michael Podvitskiy 2024-09-15 09:06:38 +0200
  • e4ab700f19 Merge 244811d856 into 822b6322de Dmitry Wolf 2024-09-15 00:05:23 -0700
  • e83d2db931 flake.lock: Update github-actions[bot] 2024-09-15 00:22:31 +0000
  • 70ca91e5c1 Update README.md OSecret 2024-09-14 23:43:39 +0300
  • e57f508ac5 sycl+intel build fix Michael Podvitskiy 2024-09-14 21:52:37 +0200
  • aaf7f53d46 nvidia uses the LLaMAForCausalLM string in their config.json, example nvidia/Llama3-ChatQA-2-8B Csaba Kecskemeti 2024-09-14 10:48:09 -0700
  • e244300df5 white space VJHack 2024-09-14 11:37:41 -0500
  • 0680710b06 updated README.md for main VJHack 2024-09-14 11:30:10 -0500
  • c52b922d98 reverted precommit VJHack 2024-09-14 11:16:54 -0500
  • 173d4bb336 added cli arg to disable context shift VJHack 2024-09-14 11:15:51 -0500
  • aa9e72158b Update clip.cpp Tejaakshaykumar 2024-09-14 18:54:26 +0530
  • 822b6322de ggml : ggml_type_name return "NONE" for invalid values (#9458) b3755 Yuri Khrustalev 2024-09-14 05:54:37 -0400
  • dcdcee3a74 server: add data: [DONE] to /chat/completions stream response (#9459) b3754 VoidIsVoid 2024-09-14 17:36:44 +0800
  • 1f4111e540 cmake : use list(APPEND ...) instead of set() + dedup linker (#9463) b3753 Georgi Gerganov 2024-09-14 10:55:05 +0300
  • befaf1197f llama : make cell_id const in inp_s_mask block (#9470) b3752 Daniel Bevenius 2024-09-14 09:50:12 +0200
  • f83e9c9737 Support MiniCPM3. 范睿凯 2024-09-05 17:48:40 +0800
  • 8241151f16 set context default to avoid memory issue, update guide arthw 2024-09-14 09:01:05 +0800
  • 8f358c4c94 server: add data: [DONE] to /chat/completions stream response VoidIsVoid 2024-09-13 11:09:41 +0800
  • da31f52722 Added link to proprietary wrapper for Unity3d into README.md OSecret 2024-09-14 00:27:48 +0300
  • 40638f7136 log : cleanup, comments, build flags Georgi Gerganov 2024-09-13 21:55:11 +0300
  • fb8f142554 one more CMAKE_CXX_FLAGS fix (#9471) gg/cmake-dedup-link Michael Podvitskiy 2024-09-13 15:13:07 +0200
  • aee23b5462 one more CMAKE_CXX_FLAGS fix Michael Podvitskiy 2024-09-13 15:04:42 +0200
  • feff4aa846 server : add loading html page while model is loading (#9468) b3751 Xuan Son Nguyen 2024-09-13 14:23:11 +0200
  • 228df2bc11 cmake : fix sycl build (#9469) Michael Podvitskiy 2024-09-13 14:11:21 +0200
  • 13226dc83e log : option to disable the log prefix Georgi Gerganov 2024-09-13 14:48:57 +0300
  • 9dec071bea llama : make cell_id const in inp_s_mask block Daniel Bevenius 2024-09-13 13:48:43 +0200
  • ff3b3809d8 server : fix verbose check Georgi Gerganov 2024-09-13 14:12:58 +0300
  • 013b6502ba Merge branch 'gg/cmake-dedup-link' into sycl-build-fix Georgi Gerganov 2024-09-13 14:22:45 +0300
  • b653b1e922 cmake : try to fix sycl 2 Georgi Gerganov 2024-09-13 14:05:00 +0300
  • a7feae74e7 also support .html files Xuan Son Nguyen 2024-09-13 12:58:00 +0200
  • 8aa5bb38af use CMAKE_CXX_FLAGS as a string variable Michael Podvitskiy 2024-09-13 12:32:23 +0200
  • 346126bd9e Merge 6b45680f21 into 0abc6a2c25 Brian Dashore 2024-09-13 04:24:38 -0600
  • 9eceb1a005 try fix sycl build Michael Podvitskiy 2024-09-13 12:10:20 +0200
  • 519e29a169 Merge ed4fcf92ff into 0abc6a2c25 Daniel Bevenius 2024-09-13 17:44:02 +0800
  • 0d0dc11185 server : improve log format Georgi Gerganov 2024-09-13 12:42:35 +0300
  • ae9475de40 cmake : try fix sycl Georgi Gerganov 2024-09-13 12:41:33 +0300
  • 8f84210df8 log : add comments + adjust defaults Georgi Gerganov 2024-09-13 12:09:43 +0300
  • 2afe0a0c7d examples : move gpt_init() after parsing the cli args Georgi Gerganov 2024-09-13 11:28:20 +0300
  • 078be074a7 log : print if build is debug [no ci] Georgi Gerganov 2024-09-13 11:15:01 +0300