Commit Graph

  • 435dfa33b8 fallback to clock seed if std::random_device is not a true RNG slaren 2024-09-10 17:33:10 +0200
  • 246a088e41 replace with llama_supports_gpu_offload Xuan Son Nguyen 2024-09-10 17:24:22 +0200
  • cae7e22d88 arg : bring back missing ifdef Xuan Son Nguyen 2024-09-10 17:11:04 +0200
  • 4516c75baa use mt19937::seed() in reset slaren 2024-09-10 16:10:40 +0200
  • c8a3f291fe ggml : hide ggml_object, ggml_cgraph, ggml_hash_set Georgi Gerganov 2024-09-10 16:38:06 +0300
  • dbae51004a remove outdated comment slaren 2024-09-10 15:31:44 +0200
  • adf3bce13b vulkan : do not use tensor->extra Radoslav Gerganov 2024-09-10 14:09:43 +0300
  • ee97254154 Added TODO to remind supporting FP32 im2col Alberto Cabrera 2024-09-10 11:41:02 +0100
  • fd46535314 Update src/llama.cpp Georgi Gerganov 2024-09-10 13:03:54 +0300
  • 6cce78c2ed Merge branch 'master' into gg/llama-perf Georgi Gerganov 2024-09-10 11:38:03 +0300
  • b94078dba7 Merge branch 'master' into gg/dead-code Georgi Gerganov 2024-09-10 11:31:58 +0300
  • 406d475837 Merge 593627a8b1 into 00ba2ff781 pomoke 2024-09-10 16:24:49 +0800
  • 00ba2ff781 metal : fix compile warning with GGML_METAL_NDEBUG (#0) b3720 Georgi Gerganov 2024-09-10 10:17:03 +0300
  • 83008b7cfe llama : update llm_build_copy_mask_state comment [no ci] (#9385) Daniel Bevenius 2024-09-10 09:03:21 +0200
  • 0b4ac75772 RWKV v6: Add time_mix_decay_w1/w2 in quant exclusion list (#9387) b3718 Molly Sophia 2024-09-10 15:02:30 +0800
  • 1e8646b3e8 feat: support internvl qianlangyu 2024-09-10 14:39:00 +0800
  • fb3f249815 make : do not run llama-gen-docs when building (#9399) b3717 slaren 2024-09-10 08:23:33 +0200
  • dfe31e0484 Adding loading page for '/' server requests VJHack 2024-09-09 22:06:56 -0500
  • 6b780d82ab squashed Eve 2024-09-08 16:10:25 -0400
  • 2217247051 imatrix : remove unused n_entries Francis Couture-Harpin 2024-09-09 22:35:47 -0400
  • efa9186dc8 imatrix : avoid using designated initializers in C++ Francis Couture-Harpin 2024-09-09 22:33:10 -0400
  • 490da45f54 feat: Add host buffer type for Ascend NPU(CANN backend) dou 2024-09-10 10:24:43 +0800
  • 894ed8d7b6 py : include imatrix converter requirements in toplevel requirements Francis Couture-Harpin 2024-09-09 22:20:18 -0400
  • 9e6b0e9419 perplexity : revert changes Francis Couture-Harpin 2024-09-09 22:00:37 -0400
  • 503630e88a py : add requirements for legacy imatrix convert script Francis Couture-Harpin 2024-09-09 21:56:04 -0400
  • c50293c348 make : do not run llama-gen-docs when building slaren 2024-09-10 03:46:38 +0200
  • 1c57c3a54a llama : move random seed generation to the samplers slaren 2024-09-10 03:41:16 +0200
  • 94596be679 convert : identify missing model files Francis Couture-Harpin 2024-09-09 19:36:31 -0400
  • 250df0e909 llama_sampler_penalties : clamp penalty_last_n to zero slaren 2024-09-10 02:49:51 +0200
  • 141dd55e53 convert : refactor rope_freqs generation Francis Couture-Harpin 2024-09-08 20:01:13 -0400
  • bb28e92e18 Merge 6ee4d4c1f2 into bfe76d4a17 Jeff Price 2024-09-09 18:01:07 -0500
  • 1d5a2df1ef sycl : update support condition to im2col Alberto Cabrera 2024-09-09 22:52:53 +0100
  • bfe76d4a17 common : move arg parser code to arg.cpp (#9388) b3716 Xuan Son Nguyen 2024-09-09 23:36:09 +0200
  • b3a4218aa9 Merge c882647e7c into 293bebe077 Pavel Fatin 2024-09-09 21:34:09 +0100
  • decde48be7 fix test Xuan Son Nguyen 2024-09-09 20:31:52 +0200
  • 42d5fc1986 update server readme Xuan Son Nguyen 2024-09-09 20:25:53 +0200
  • 96311e3248 refactor gpt_params_parse Xuan Son Nguyen 2024-09-09 20:23:45 +0200
  • cf2a874142 fix build Xuan Son Nguyen 2024-09-09 19:02:40 +0200
  • 293bebe077 rpc : fix segfault with nkvo (#9389) b3715 Radoslav Gerganov 2024-09-09 18:40:10 +0300
  • 5fac4d5764 ggml : vector length agnostic SVE support (#9290) b3714 Prashant Vithule 2024-09-09 21:07:18 +0530
  • bb689e1d82 Update ggml/src/ggml-quants.c Prashant Vithule 2024-09-09 21:02:43 +0530
  • 6412a598a1 common : more explicit includes Georgi Gerganov 2024-09-09 18:22:25 +0300
  • 5fb5e24811 llama : minor sampling refactor (2) (#9386) b3713 slaren 2024-09-09 17:10:46 +0200
  • 3e03807043 missing cstdarg Xuan Son Nguyen 2024-09-09 16:26:39 +0200
  • fa00ec0e59 missing climits Xuan Son Nguyen 2024-09-09 16:07:58 +0200
  • 2fd513a826 match discrete_dist type and function return type slaren 2024-09-09 16:06:05 +0200
  • fe16c7a8ad fix type specifier in format string slaren 2024-09-09 15:58:03 +0200
  • 8f5f25cd72 rpc : buf_size must not be static Radoslav Gerganov 2024-09-09 16:57:47 +0300
  • 30f06726e7 add cmake Xuan Son Nguyen 2024-09-09 15:56:25 +0200
  • 6e801df136 rpc : fix nkvo slaren 2024-09-07 03:24:47 +0200
  • 9ea1e93591 better categorize args Xuan Son Nguyen 2024-09-09 15:44:48 +0200
  • 9444f3fca2 RWKV v6: Add time_mix_decay_w1/w2 in quant exclusion list Molly Sophia 2024-09-09 21:31:04 +0800
  • 4f7b808ba7 llama : minor sampling refactor (2) slaren 2024-09-09 15:27:07 +0200
  • 5d399f5689 common : move arg parser to arg.cpp Xuan Son Nguyen 2024-09-09 15:17:58 +0200
  • bfeb2f5b3b llama : update llm_build_copy_mask_state comment Daniel Bevenius 2024-09-09 15:04:34 +0200
  • 2bed2542ba Merge pull request #1 from ggerganov/SVE-vector-length-agnostic-VLA-gg Prashant Vithule 2024-09-09 18:22:16 +0530
  • 38ca6f644b readme : update hot topics Georgi Gerganov 2024-09-09 15:51:37 +0300
  • 8e6e2fbe14 CUDA: fix variable name conflict for Windows build (#9382) b3711 Johannes Gäßler 2024-09-09 14:22:53 +0200
  • 8d954a8629 CUDA: fix variable name conflict for Windows build Johannes Gäßler 2024-09-09 13:28:40 +0200
  • 5ed087573e readme : add LLMUnity to UI projects (#9381) Antonis Makropoulos 2024-09-09 14:21:38 +0300
  • 8d8aa81dcc add newline to examples/rpc/README.md to fix editorconfig-checker unit test Antonis Makropoulos 2024-09-09 14:18:46 +0300
  • cfbf33a705 ggml : style changes + fix 512-bit nb loop check SVE-vector-length-agnostic-VLA-gg Georgi Gerganov 2024-09-09 12:50:35 +0300
  • 8bd723e5c5 add LLMUnity to UI projects Antonis Makropoulos 2024-09-09 12:34:17 +0300
  • 54f376d0b9 rpc : update README [no ci] (#9320) Radoslav Gerganov 2024-09-09 11:04:39 +0300
  • 195a062986 make tokenizer_pre consistent; llama.cpp work hoangdz 2024-09-09 16:20:39 +0900
  • b2e89a3274 Arm AArch64: Documentation updates (#9321) Dan Johansson 2024-09-09 09:02:45 +0200
  • e26d17c0bb ci: bump actions/checkout to v4 Trivikram Kamat 2024-09-08 16:23:55 -0700
  • daa9623ab0 Overlap cmdbuffer creation and cmdbuffer execution in Vulkan backend by submitting smaller cmdbuffers early. (#9118) b3707 Markus Tavenrath 2024-09-08 21:43:48 +0200
  • e079bffb66 cuda : fix FA Q src index (1 -> 0) (#9374) b3706 Georgi Gerganov 2024-09-08 22:01:02 +0300
  • f0de0bf28e Merge 39ae18444f into 3f7ccfd649 curvedinf 2024-09-08 18:13:39 +0200
  • 3f7ccfd649 common : bring back missing args, add env var duplication check (#9375) b3705 Xuan Son Nguyen 2024-09-08 18:08:55 +0200
  • 374acb9392 correct default values Xuan Son Nguyen 2024-09-08 17:29:58 +0200
  • 944ea66861 cuda : fix FA Q src index (1 -> 0) Georgi Gerganov 2024-09-08 18:23:30 +0300
  • 9b04a44325 add check for duplicated env var Xuan Son Nguyen 2024-09-08 17:21:07 +0200
  • b5dd43555a move duplication check to test-arg-parser Xuan Son Nguyen 2024-09-08 17:18:17 +0200
  • 056822ec4f common : bring back missing args Xuan Son Nguyen 2024-09-08 17:12:33 +0200
  • d19101c9a0 imatrix : use FMA and sort tensor names Francis Couture-Harpin 2024-09-08 11:03:59 -0400
  • a249843d89 common : restore --n-gpu-layers (#9371) b3704 slaren 2024-09-08 16:44:42 +0200
  • 3ad0603c65 Merge branch 'master' into compilade/imatrix-batched-chunks Francis Couture-Harpin 2024-09-08 10:05:08 -0400
  • c8ab6a3ba3 imatrix : fix conversion problems Francis Couture-Harpin 2024-09-08 10:04:01 -0400
  • 19f4a7b296 llama : refactor samplers internal implementation (#9370) b3703 slaren 2024-09-08 15:52:07 +0200
  • 6c95dfe829 remove outdated comment slaren 2024-09-08 15:17:28 +0200
  • e1c4fb7f9c fix LLAMA_TOKEN_NULL checks in penalties sampler slaren 2024-09-08 15:16:50 +0200
  • 80f6666b26 common : restore --n-gpu-layers slaren 2024-09-08 15:07:46 +0200
  • f3ecf6d740 llama : refactor samplers internal implementation slaren 2024-09-08 15:05:05 +0200
  • 2a358fb0c4 [SYCL] add check malloc result on device (#9346) b3702 Neo Zhang Jianyu 2024-09-08 19:05:29 +0800
  • c882647e7c Direct I/O and Transparent HugePages Pavel Fatin 2024-05-20 21:55:33 +0200
  • eae597182c llama : sanitize tokens in the upper bound (#9359) b3701 slaren 2024-09-08 12:41:51 +0200
  • 00b02bb249 imatrix : fix arg parser for imatrix (#9366) b3700 Xuan Son Nguyen 2024-09-08 12:12:17 +0200
  • 08d9acd981 beautify printing first arg Xuan Son Nguyen 2024-09-08 11:38:03 +0200
  • 01da813dc7 imatrix : fix arg parser Xuan Son Nguyen 2024-09-08 11:28:57 +0200
  • e106feb048 update for review comments, check all malloc_device() result arthw 2024-09-08 17:23:33 +0800
  • 9dc0223390 Fix some nodes are not checked with GGML_VULKAN_CHECK_RESULTS enabled. Markus Tavenrath 2024-09-08 11:19:34 +0200
  • dffb4b1909 Merge db78320b4d into a876861455 jaime-m-p 2024-09-08 10:06:51 +0200
  • a876861455 metal : update support condition for im2col + fix warning (#0) b3699 Georgi Gerganov 2024-09-08 09:57:57 +0300
  • 385decbd63 sync : ggml Georgi Gerganov 2024-09-08 09:38:56 +0300
  • 60a3107ccd scripts : option to increase git patch context Georgi Gerganov 2024-09-08 09:38:42 +0300
  • 406c1a32a1 vulkan: add dryrun support to sin and cos ops (ggml/947) Salvatore Mesoraca 2024-09-06 14:34:25 +0200
  • 9cb9260861 vulkan: correctly report support for OP_CONT (ggml/946) Salvatore Mesoraca 2024-09-06 14:34:07 +0200
  • 202084d31d tests: add gradient tests for all backends (ggml/932) Johannes Gäßler 2024-09-03 17:21:46 +0200