Commit Graph

  • b41b6e679f opencl: fix MSVC builds (string length error) Max Krasnyansky 2024-12-12 22:03:27 -0800
  • b25a4caaf4 opencl: fail gracefully if opencl devices are not available Max Krasnyansky 2024-12-12 14:51:08 -0800
  • c971a1885d opencl: fix compiler warnings with GCC and Clang Max Krasnyansky 2024-12-12 12:31:55 -0800
  • 3bc085b359 opencl: use pools for tensor_extra Li He 2024-12-11 23:19:52 -0800
  • 74a9bafcb9 opencl: remove limits on tensor_extra Li He 2024-12-11 21:46:03 -0800
  • 70063c6c0c opencl: replace some more OPENCL2 leftovers Max Krasnyansky 2024-12-11 21:38:24 -0800
  • c64ef0fb5c opencl: remove copyright marker since main license already covers Li He 2024-12-11 15:15:46 -0800
  • e447dbcc01 opencl: rename backend - funcs, structs, etc opencl2 -> opencl Li He 2024-12-11 14:48:26 -0800
  • 22411ab58f opencl: make OpenCL required, remove redundant lib and inc directories Li He 2024-12-11 14:07:36 -0800
  • 97a12703dd opencl: rename kernel files ggml-opencl2 -> ggml-opencl Li He 2024-12-10 23:24:34 -0800
  • 34f2fc15ea opencl: rename backend opencl2 -> opencl Li He 2024-12-10 22:17:24 -0800
  • e9a97381f2 opencl: use GGML_LOG_xxx instead of fprintf(stderr, ...) Li He 2024-12-10 20:42:15 -0800
  • 9a9d92b0b9 opencl: use cl_ulong for sizes and strides Max Krasnyansky 2024-12-07 18:02:15 -0800
  • c21fc8c5f9 opencl: use cl_ulong for all offsets Max Krasnyansky 2024-12-07 17:44:42 -0800
  • 31f305ea01 opencl: use ulong for offsets and strides in ADD kernel Max Krasnyansky 2024-12-07 17:35:26 -0800
  • 0451edd936 opencl: cleanup ggml-opencl2 header file Max Krasnyansky 2024-12-07 16:49:01 -0800
  • 66d4330377 opencl: Clean up small-alloc in CMake files Li He 2024-11-28 23:05:51 -0800
  • 969a00a4b9 opencl: CI workflow fixes Max Krasnyansky 2024-11-28 16:37:03 -0800
  • 4bca601be6 opencl: fix embed tool invocation with python3 Max Krasnyansky 2024-11-28 16:24:44 -0800
  • 9b6540b6f9 opencl-ci: use RUNNER_TEMP instead of github.workspace Max Krasnyansky 2024-11-28 15:57:36 -0800
  • d24b360255 opencl: fixed merge conflict (MUSA added twice in cmake) Max Krasnyansky 2024-11-28 15:37:54 -0800
  • 671c7af6b9 opencl: remove small-alloc support and fix build errors for non-opencl platforms Max Krasnyansky 2024-11-28 15:29:08 -0800
  • 8ad0bb30df opencl: integrate backend dyn.load interface and fix compiler and format warnings Max Krasnyansky 2024-11-27 18:09:05 -0800
  • c1af4b72b7 [cl][adreno] Fix memory leak for non SMALL_ALLOC path Li He 2024-11-26 22:57:42 -0800
  • 3571bb6c63 [cl][ci] Add workflow for CL Li 2024-11-26 13:54:10 -0800
  • f56fb699bc [cl][adreno] Add Adreno GPU support Li He 2024-11-25 11:27:15 -0800
  • 42c44893a6 Update server JSON response. MichelleTPY 2024-12-13 18:39:33 +0000
  • c27ac678dd Opt class for positional argument handling (#10508) b4324 Eric Curtin 2024-12-13 18:34:25 +0000
  • 7555ab1cdc convert-hf : use GPT2 vocab and ignore sliding_window hparam for Phi-4 model llama : use regular (not a sliding window) attention mask for Phi-4 model Stanisław Szymczyk 2024-12-13 18:55:08 +0100
  • 4710c27e06 Opt class for positional argument handling Eric Curtin 2024-11-25 21:13:36 -0500
  • 11e07fd63b fix: graceful shutdown for Docker images (#10815) Corentin REGAL 2024-12-13 18:23:50 +0100
  • 9fc6adeb24 fix gzip non deterministic Xuan Son Nguyen 2024-12-13 18:19:12 +0100
  • bd5e39fc0e use better maintained @vscode/markdown-it-katex Xuan Son Nguyen 2024-12-13 16:42:28 +0100
  • 43e8aa2912 fix: graceful shutdown for Docker images Corentin REGAL 2024-12-13 16:39:21 +0100
  • b3d2b06008 Revert "remove katex" Xuan Son Nguyen 2024-12-13 16:32:14 +0100
  • 57b96e8d61 use gzip Xuan Son Nguyen 2024-12-13 16:30:43 +0100
  • 19aba1dbbe update ggml_backend_*_supports_op of unsupported backends HimariO 2024-12-13 23:16:57 +0800
  • 4601a8bb67 gguf-py : numpy 2 newbyteorder fix (#9772) Jett Janiak 2024-12-13 15:48:44 +0100
  • 630ddcc570 update docs Xuan Son Nguyen 2024-12-13 14:35:51 +0100
  • 262950db25 Merge branch 'master' into xsn/fix_logprobs Xuan Son Nguyen 2024-12-13 14:34:53 +0100
  • 196e237e09 add --multi-token-probs Xuan Son Nguyen 2024-12-13 14:33:44 +0100
  • a02a190756 minor updates HimariO 2024-12-13 21:31:51 +0800
  • 9f35e44592 Fix crash caused by ggml_backend_load_all when launching on Android Activity (#10812) b4321 谢乃闻 2024-12-13 12:56:07 +0000
  • 455bcc8474 reuse some langs Xuan Son Nguyen 2024-12-13 12:32:17 +0100
  • e2e9a6caf7 update position id tensor size check in GGML_OP_ROPE HimariO 2024-12-13 19:31:34 +0800
  • a8a619ba47 add php Xuan Son Nguyen 2024-12-13 12:28:39 +0100
  • e423d48a2f add more languages Xuan Son Nguyen 2024-12-13 12:26:50 +0100
  • 41f04470d0 add bundle size check Xuan Son Nguyen 2024-12-13 12:22:07 +0100
  • c0e5046acc remove katex Xuan Son Nguyen 2024-12-13 12:16:26 +0100
  • ef7f74bddd store llama_hparams.rope_sections with fixed size array HimariO 2024-12-13 19:03:36 +0800
  • e9748e4aa0 fix traililng whitespce HimariO 2024-12-13 18:30:48 +0800
  • 95842c19f1 highlight: smaller bundle size, fix light & dark theme Xuan Son Nguyen 2024-12-13 11:16:42 +0100
  • c292bf1d11 Merge branch 'ggerganov:master' into qwen2-vl HimariO 2024-12-13 18:12:53 +0800
  • 9abb25278a Apply suggestions from code review HimariO 2024-12-13 18:11:08 +0800
  • 60bbd4ebf1 Apply code format changes Molly Sophia 2024-12-13 17:43:08 +0800
  • 77fe4fd982 RWKV_WKV6 Vulkan op tests passed Molly Sophia 2024-12-13 17:19:23 +0800
  • 4651f5e2f2 rwkv_wkv6 vulkan shader Zhiyuan Li 2024-11-02 01:45:27 +1100
  • 64ae065511 vulkan: small mul_mat_vec optimizations (#10665) b4320 Eve 2024-12-13 08:42:04 +0000
  • 06bb38e75d update docs Xuan Son Nguyen 2024-12-13 08:50:47 +0100
  • 20b47d4d94 Also double the number of rows for Intel GPUs 0cc4m 2024-12-13 08:13:15 +0100
  • 83ed24a97b SYCL: Reduce most of the compiler warnings (#10748) b4319 Akarshan Biswas 2024-12-13 12:12:15 +0530
  • 1aa26d783a set min and max subgroup size in any case Eve 2024-12-12 20:25:22 -0500
  • 1f032a9577 Update ggml/src/ggml-backend-reg.cpp Diego Devesa 2024-12-13 01:07:03 +0100
  • d583cd03f6 ggml : Fix compilation issues on ARM platform when building without fp16 (#10811) b4318 Karol Kontny 2024-12-13 01:04:19 +0100
  • f17d2c721e Fix crash caused by ggml_backend_load_all when launching on AndroidActivity. NAIWENXIE\Naiwen 2024-12-12 23:55:39 +0000
  • c6f5488eaf ggml: Fix compilation issues on ARM platform when building without fp16 Karol Kontny 2024-12-12 23:12:08 +0100
  • b83e9a6cd2 fix: Remove unused LLM_KV_ATTENTION_LAYER_COUNT Gabe Goodhart 2024-12-12 15:02:38 -0700
  • 97e6ba8d99 fix: Remove outdated TODO in convrsion script Gabe Goodhart 2024-12-12 15:02:05 -0700
  • adffa6ffd5 common : improve -ctv -ctk CLI arguments (#10806) b4317 Xuan Son Nguyen 2024-12-12 22:53:05 +0100
  • 8b13f2d005 fix conflict Eve 2024-12-12 16:44:31 -0500
  • d9c6bf16d4 manual merge ggml-vulkan.cpp Eve 2024-12-12 16:42:46 -0500
  • d0fbab4710 style fixes VJHack 2024-12-12 15:37:40 -0600
  • 661302278e fixed coding style VJHack 2024-12-12 15:31:13 -0600
  • 8f8f3fc61c fixed coding style VJHack 2024-12-12 15:28:05 -0600
  • 3fec232f82 use std::vector Xuan Son Nguyen 2024-12-12 22:27:21 +0100
  • ef0fc843ba rebuild public/index.html VJHack 2024-12-12 15:23:59 -0600
  • 304e1e5860 build public/index.html VJHack 2024-12-12 14:46:06 -0600
  • 42a9c5a948 even better approach Xuan Son Nguyen 2024-12-12 21:27:53 +0100
  • 5622ff5600 code cleanup VJHack 2024-12-12 14:21:03 -0600
  • 6b3696013c regenerate docs Xuan Son Nguyen 2024-12-12 21:14:45 +0100
  • 27830711fd add code highlighting and math formatting VJHack 2024-12-12 14:14:18 -0600
  • ba35f29e81 common : improve ctv ctk cli argument Xuan Son Nguyen 2024-12-12 21:09:22 +0100
  • 274ec65af6 contrib : add ngxson as codeowner (#10804) Xuan Son Nguyen 2024-12-12 20:52:28 +0100
  • 92a9392fcd Updated arg.cpp instead of auto-generated README.md MichelleTPY 2024-12-12 19:32:39 +0000
  • 204e78fba1 fix: A number of places where hybrid needs to be handled Gabe Goodhart 2024-12-10 15:34:53 -0700
  • 4543ed5640 feat: Update the logic in llama_decode_internal for kv_hybrid cache Gabe Goodhart 2024-12-10 11:00:01 -0700
  • 44bf431ab4 fix: Only allocate kv cache tensors for the appropriate layers in hybrid models Gabe Goodhart 2024-12-10 10:48:48 -0700
  • 92653d05fd WIP: Partial work towards separate hybrid cache Gabe Goodhart 2024-12-09 13:42:54 -0700
  • d3a34e0282 fix: per-layer recurrent embd_[kv]_s Gabe Goodhart 2024-12-09 15:51:32 -0700
  • f2478bcab5 fix: Get n_head_kv per-layer in build_bamba Gabe Goodhart 2024-12-09 13:43:27 -0700
  • e7b1abbc0a feat(bamba): Partially complete work on constructing the forward graph Gabe Goodhart 2024-12-05 11:04:54 -0700
  • 41fc019057 fix(bamba): Remove ssm_head_count and ssm_chunk_size in llama.cpp Gabe Goodhart 2024-12-05 11:01:02 -0700
  • dfe8d3ddb8 fix(bamba conv): Remove chunk size and consolidate head count w/ time step rank Gabe Goodhart 2024-12-05 10:59:21 -0700
  • 3ee0ae3b90 feat(bamba): Full tensor parsing for bamba Gabe Goodhart 2024-12-04 12:01:45 -0700
  • fd3bb30118 fix(bamba conv): Fizes in tensor name and hparam conversion for llama.cpp parsing Gabe Goodhart 2024-12-04 12:00:46 -0700
  • e0af809b05 feat(bamba): hparam parsing in llama.cpp Gabe Goodhart 2024-12-03 16:29:32 -0700
  • 1c1e0080ed fix(bamba): Jamba->Bamba in llama.cpp Gabe Goodhart 2024-12-03 16:29:13 -0700
  • fd98682ec3 fix(bamba conv): Jamba -> Bamba Gabe Goodhart 2024-12-03 16:27:29 -0700
  • e3525e9e50 feat(convert): Full pass at hparam conversion Gabe Goodhart 2024-12-02 16:27:19 -0700
  • 246dfdba65 feat(jamba): Add jamba architecture to llama.cpp enums Gabe Goodhart 2024-11-26 14:30:12 -0700