Commit Graph

  • 79a8176883
    server : add "tokens" output Georgi Gerganov 2024-12-16 21:03:24 +0200
  • dfd5207590
    Merge 244811d856 into 08ea539df2 Dmitry Wolf 2024-12-16 19:39:43 +0530
  • aa13d69905 fix erros in EditorConfig Checker Zhiyuan Li 2024-12-16 19:34:17 +0800
  • 353c5f8c7b add uma support Zhiyuan Li 2024-12-16 18:43:22 +0800
  • 08ea539df2
    unicode : improve naming style (#10838) Georgi Gerganov 2024-12-16 12:31:45 +0200
  • 644fd71b44
    sampling : refactor + optimize penalties sampler (#10803) Georgi Gerganov 2024-12-16 12:31:14 +0200
  • 6ea605ddfc add [[unroll]] and remove unnecessary conditions Zhiyuan Li 2024-12-16 17:35:44 +0800
  • b58ebf30ae
    webui : update Georgi Gerganov 2024-12-16 11:25:17 +0200
  • e27c711981
    llama : minor Georgi Gerganov 2024-12-15 11:36:25 +0200
  • 60d26ded4b
    readme : restore hint about --ignore-eos flag [no ci] Georgi Gerganov 2024-12-13 14:04:09 +0200
  • 685c84c35e
    common : move back the penalties at the front of the sampling chain Georgi Gerganov 2024-12-13 12:54:10 +0200
  • 1ff9296253
    common : ignore all EOG tokens Georgi Gerganov 2024-12-12 22:50:34 +0200
  • 97261aa216
    common : by default, move the penalties at the end of the sampling chain Georgi Gerganov 2024-12-12 22:29:09 +0200
  • 9847a375f3
    params : allow penalty_last_n == -1 to be equal to context size Georgi Gerganov 2024-12-12 21:55:20 +0200
  • a04a5b526b
    batched : remove penalties sampler Georgi Gerganov 2024-12-12 21:33:28 +0200
  • 58a5c3bb0f
    common : apply ignore_eos as logit bias Georgi Gerganov 2024-12-12 21:22:33 +0200
  • 0a1f7fb66d
    sampling : refactor + optimize penalties sampler Georgi Gerganov 2024-12-12 20:39:16 +0200
  • d349f61380 Change model name Billel Mokeddem 2024-12-16 07:52:25 +0000
  • 5cb6209de5 Fixes for building SYCL backend for AMD GPUs lhl 2024-12-16 16:44:57 +0900
  • 5d46c48137 fix musa build on aarch64 Huaishun Hu 2024-12-16 14:42:53 +0800
  • 2607b7de0f
    SYCL: Integrate debug logs with GGML_LOG and other fixes Akarshan Biswas 2024-12-16 11:27:38 +0530
  • c656d929ff multi row k quant shaders! Eve 2024-12-15 23:34:50 -0500
  • 93776cfd74
    Merge 883dc22d44 into 4ddd199f6f Robert 2024-12-16 00:07:01 +0100
  • 605339ec95
    Merge 4ff0831ce6 into 4ddd199f6f Georgi Gerganov 2024-12-15 14:41:08 -0800
  • 658795b84f
    Merge 7e3feff073 into 4ddd199f6f Olivier Chafik 2024-12-15 14:41:08 -0800
  • 663716ce42 only s_warptile_mmq needs to be run with 32 threads or more Eve 2024-12-15 15:47:24 -0500
  • 4ddd199f6f
    llava : Allow locally downloaded models for QwenVL (#10833) Bartowski 2024-12-15 15:43:25 -0500
  • a0974156f3
    llama : add Deepseek MoE v1 & GigaChat models (#10827) b4333 Valentin Mamedov 2024-12-16 00:02:46 +0700
  • 87cf323cef
    scripts : change build path to "build-bench" for compare-commits.sh (#10836) Georgi Gerganov 2024-12-15 18:44:47 +0200
  • 19ce4b64b7
    SYCL: Add pragma directive to suppress warning spam Akarshan Biswas 2024-12-15 21:02:43 +0530
  • 7778b89d30 add test for no dangling pointers Johannes Gäßler 2024-12-15 15:18:01 +0100
  • 5ed4403558
    SYCL: Add back static to ggml_backend_buffer_is_sycl_split function Akarshan Biswas 2024-12-15 19:22:52 +0530
  • 0662a86809
    SYCL: remove extra space Akarshan Biswas 2024-12-15 19:18:52 +0530
  • f8603b0cc0
    SYCL: fix assertions and add proper comments Akarshan Biswas 2024-12-15 19:11:43 +0530
  • 0c6eafdac1 change placement of gigachat chat template Valentin Mamedov 2024-12-15 20:32:42 +0700
  • 6cdb3d86f9 Merge remote-tracking branch 'upstream/master' into gigachat-model Valentin Mamedov 2024-12-15 20:24:29 +0700
  • 78ef42665b move deepseek above deepseek2 Valentin Mamedov 2024-12-15 20:22:41 +0700
  • f8ff867a63
    Merge 739648f3e6 into 5478bbcd17 Heiner 2024-12-15 14:19:48 +0100
  • da40c42062
    SYCL: common.cpp try to migrate away from tensor->backend Akarshan Biswas 2024-12-15 18:41:58 +0530
  • 6ee759966c use std::string instead of static char Judd 2024-12-15 20:01:04 +0800
  • 5478bbcd17
    server: (UI) add syntax highlighting and latex math rendering (#10808) b4331 Vinesh Janarthanan 2024-12-15 05:55:54 -0600
  • b5ae1ddff9
    gguf-py : bump to v0.13.0 gguf-v0.13.0 Georgi Gerganov 2024-12-15 13:16:42 +0200
  • 39f8347504 use id for color; simple_hash removed. Judd 2024-12-15 19:00:36 +0800
  • 3e92f4ecbe
    cont [no ci] gg/unicode-refactor Georgi Gerganov 2024-12-15 12:36:03 +0200
  • 8c2233ac06
    rm trailing space Xuan Son Nguyen 2024-12-15 11:35:15 +0100
  • 7a20c287c7
    unicode : improve naming style Georgi Gerganov 2024-12-15 12:24:04 +0200
  • 7e9208e408
    scripts : change build path to "build-bench" for compare-commits.sh gg/compare-change-path Georgi Gerganov 2024-12-15 11:47:30 +0200
  • 8e69669007
    Fix compilation on Pop!_OS 22.04 LTS CUDA Mika Pi 2024-12-15 00:43:30 -0800
  • 5806435526 Merge remote-tracking branch 'upstream/master' into gigachat-model Valentin Mamedov 2024-12-15 14:32:26 +0700
  • 6e13df8d57 remove comments Valentin Mamedov 2024-12-15 14:05:17 +0700
  • 43c679507f fix order of deepseek and deepseek2 in constants; mark shared exp as deepseek arch need Valentin Mamedov 2024-12-15 13:53:42 +0700
  • b32159c8a7 fix order of deepseek and deepseek2, move gigachat temlate to the end of func Valentin Mamedov 2024-12-15 13:42:33 +0700
  • 66e59b0155 lint llama.cpp Valentin Mamedov 2024-12-15 13:37:12 +0700
  • 35bff171af
    Migrate to tensor->buffer for checking backend buffer type: 1 Akarshan Biswas 2024-12-15 11:45:43 +0530
  • 7e3feff073 tool-call: stabilize server tests ochafik 2024-12-15 00:16:12 +0000
  • 89d604f2c8
    server: Fix has_next_line in JSON response (#10818) gguf-v0.12.0 b4329 Michelle Tan 2024-12-14 22:29:45 +0000
  • 7bfd83ce05 fix memory leak failure Johannes Gäßler 2024-12-14 22:43:58 +0100
  • 107b3538d0
    Define model_path Bartowski 2024-12-14 16:19:31 -0500
  • b01af274c7
    Allow locally downloaded models for QwenVL Bartowski 2024-12-14 15:54:22 -0500
  • e52aba537a
    nix: allow to override rocm gpu targets (#10794) Evgeny Kurnevsky 2024-12-14 18:17:36 +0000
  • fecf662ec1 try Windows fix Johannes Gäßler 2024-12-14 18:07:46 +0100
  • baa8b5d2d2 try macOS fix Johannes Gäßler 2024-12-14 16:43:39 +0100
  • f220234fe1 Clean up: Fix lint. MichelleTPY 2024-12-14 15:42:55 +0000
  • 558e690614 Clean up: Fix lint. MichelleTPY 2024-12-14 15:41:15 +0000
  • 7bfcd0a8dd Merge remote-tracking branch 'origin/master' into tool-call ochafik 2024-12-14 15:08:00 +0000
  • 1e2115ffb9 tool-calls: shorter name: grammar_triggers ochafik 2024-12-14 15:05:18 +0000
  • 055053c859 Merge remote-tracking branch 'origin/master' into tool-call ochafik 2024-12-14 15:04:45 +0000
  • 299d681c52 tests: add tests for GGUF Johannes Gäßler 2024-12-10 15:50:27 +0100
  • 0579e3bf65 Refactor: Add llamma_ prefix in unicode.h unicode.cpp MichelleTPY 2024-12-14 14:18:25 +0000
  • 858dad8d91 latex codeblock as code Xuan Son Nguyen 2024-12-14 14:46:55 +0100
  • 7985295afb fix format Valentin Mamedov 2024-12-14 20:42:59 +0700
  • 64c16c4ae0 Merge branch 'master' into vulkan Zhiyuan Li 2024-12-14 21:28:50 +0800
  • 89714175e7 remove comment Valentin Mamedov 2024-12-14 20:22:19 +0700
  • f3d0a23fe5 delete comments Valentin Mamedov 2024-12-14 20:20:45 +0700
  • ca168fc7a4 add readme Valentin Mamedov 2024-12-14 20:00:01 +0700
  • 2d30fd4457 improve template code Valentin Mamedov 2024-12-14 19:59:06 +0700
  • 504121ec4b fix warnings; remove ggml_backend_sched_splits_fdump_dot. Judd 2024-12-14 20:55:16 +0800
  • ba1cb19cdd
    llama : add Qwen2VL support + multimodal RoPE (#10361) b4327 HimariO 2024-12-14 20:43:46 +0800
  • 9f89d7d8e4 Merge remote-tracking branch 'fork/master' Valentin Mamedov 2024-12-14 15:32:30 +0300
  • da8cf83f86 Add deepseek v1 arch & gigachat template Valentin Mamedov 2024-12-14 15:23:40 +0300
  • 7db99a044e Address code review comment: type check for has_new_line in unit test MichelleTPY 2024-12-14 12:05:39 +0000
  • cd4015643f add comment Xuan Son Nguyen 2024-12-14 12:58:09 +0100
  • bb4e17fc70 fix latex rendering Xuan Son Nguyen 2024-12-14 12:55:21 +0100
  • 10aa898c83 ability to add a demo conversation for dev Xuan Son Nguyen 2024-12-14 12:35:59 +0100
  • 046c0d77a9 llama : use zero value of n_swa to distinguish Phi-4 from other PHI3 models Stanisław Szymczyk 2024-12-14 12:00:19 +0100
  • c7fdbd3735 convert-hf : use zero value of sliding_window to distinguish Phi-4 from other PHI3 models Stanisław Szymczyk 2024-12-14 11:59:59 +0100
  • 520e8a0377 convert-hf : do not use model name to distinguish Phi-4 from Phi-3 Stanisław Szymczyk 2024-12-14 11:28:14 +0100
  • 12d8cd683d add ggml_backend_sched_dump_dot Judd 2024-12-14 18:09:39 +0800
  • f96909e2fd
    remote old rope_section compare operator HimariO 2024-12-14 11:58:38 +0800
  • 6110a9b36e
    Merge branch 'master' into vulkan_llvmpipe Eve 2024-12-14 01:33:59 +0000
  • 4de800cc82
    Merge branch 'ggerganov:master' into server-update-JSON-response Michelle Tan 2024-12-14 00:43:39 +0000
  • 29e6298d2e Remove has_new_line unit test changes. MichelleTPY 2024-12-14 00:36:33 +0000
  • a2e03b826f fix: Use gpt2 tokenizer for roberta and add eos/bos tokens Gabe Goodhart 2024-12-13 16:41:40 -0700
  • 56eea0781c
    Removes spurious \r in output that causes logging in journalctl to treat lines as binary and therefore hidden by default (#10771) b4326 cduk 2024-12-13 23:21:49 +0100
  • 3c8a053459
    Merge branch 'ggerganov:master' into server-update-JSON-response Michelle Tan 2024-12-13 21:44:11 +0000
  • a76c56fa1a
    Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693) b4325 lhez 2024-12-13 12:23:52 -0800
  • 2e0f15fd91
    Merge 8545425976 into c27ac678dd Wang Qin 2024-12-13 20:47:10 +0100
  • 220cf7f780 Add unit test to check has_new_line JSON response MichelleTPY 2024-12-13 19:22:15 +0000
  • 9697d07b21 opencl: update log message for unsupported GPUs Max Krasnyansky 2024-12-13 10:31:54 -0800
  • dbaa360a55 opencl: check for various requirements, allow deprecated API Li He 2024-12-12 23:06:17 -0800