Commit Graph

  • 207449810e tests: Add im2col perf tests Jeff Bolz 2024-12-21 16:11:22 -0600
  • a04db23fa7 vulkan: initialize some buffer/offset variables Jeff Bolz 2024-12-21 14:09:41 -0600
  • d02e63b64d server : set default top-k to 1 in the web ui Georgi Gerganov 2024-12-21 12:20:16 +0200
  • cb1215354b rpc-server : add support for the SYCL backend Radoslav Gerganov 2024-12-21 11:16:42 +0200
  • 9d5c711587 llama : the WPM vocabs use the CLS token as BOS Georgi Gerganov 2024-12-21 10:22:04 +0200
  • 5cd85b5e00 convert : add BertForMaskedLM (#10919) Georgi Gerganov 2024-12-21 10:10:18 +0200
  • a91a41364b vulkan: optimize coopmat2 dequant functions (#10855) Jeff Bolz 2024-12-21 01:04:45 -0600
  • 59afb0160e custom_build_wincuda Jianlin Shi 2024-12-20 22:37:51 -0700
  • 45d3bcd434 12.6.2 Jianlin Shi 2024-12-20 22:37:03 -0700
  • 2efb2c18cc vulkan: build fixes for 32b Jeff Bolz 2024-12-20 23:20:20 -0600
  • 9a0de02cf3 Merge 6893f3ac5d into e34c5af43f Brian 2024-12-20 15:55:09 -0800
  • e34c5af43f ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0() (#10874) b4372 Adrien Gallouët 2024-12-21 00:33:37 +0100
  • 774405fff6 Merge f9e9792f1d into eb5c3dc64b KenForever 2024-12-20 17:12:21 +0000
  • eb5c3dc64b SYCL: Migrate away from deprecated ggml_tensor->backend (#10840) b4371 Akarshan Biswas 2024-12-20 21:01:28 +0530
  • 564c23b8fd convert : add BertForMaskedLM Georgi Gerganov 2024-12-20 16:54:51 +0200
  • 0ca416c91a server : (UI) fix copy to clipboard function (#10916) Xuan Son Nguyen 2024-12-20 14:12:06 +0100
  • 21ae3b9be8 ggml : add test for SVE and disable when it fails (#10906) b4369 Diego Devesa 2024-12-20 13:31:28 +0100
  • 0dc073a8bf update README Xuan Son Nguyen 2024-12-20 12:49:04 +0100
  • 44e9a474f3 server : add system_fingerprint to chat/completion Xuan Son Nguyen 2024-12-20 12:42:40 +0100
  • 6161e89129 server : (UI) fix copy to clipboard function Xuan Son Nguyen 2024-12-20 11:52:37 +0100
  • 0a11f8b7b5 convert : fix RWKV v6 model conversion (#10913) b4368 Molly Sophia 2024-12-20 17:44:58 +0800
  • a20a94f566 RWKV 6: Fix error in ggml_cuda_op_bin_bcast Molly Sophia 2024-12-20 15:03:34 +0800
  • ff3d22655b Enable --no-context-shift for llama-perplexity example Molly Sophia 2024-12-20 15:03:00 +0800
  • d70f5fca74 llamafile_sgemm API - INT8 implementation Amrita H S 2024-12-20 01:20:21 -0500
  • 2116f48bec Add support for the cohere2 model architecture. DAN™ 2024-12-19 09:20:55 -0500
  • 5d6cb3a2b8 ggml : add test for SVE and disable when it fails slaren 2024-12-19 21:16:14 +0100
  • fa707391a6 readjust row selection Eve 2024-12-19 12:22:35 -0500
  • d0c0945177 Update llama-run to include temperature option Eric Curtin 2024-12-19 13:58:17 +0000
  • d408bb9268 clip : disable GPU support (#10896) b4367 Georgi Gerganov 2024-12-19 18:47:15 +0200
  • 5cab3e4aaa llama : minor grammar refactor (#10897) b4366 Georgi Gerganov 2024-12-19 17:42:13 +0200
  • 2514f97c14 Merge dc68a59064 into 36319dec5d Clarissa Miranda 2024-12-19 17:35:19 +0200
  • 36319dec5d tts : small QoL for easy model fetch (#10903) b4365 Georgi Gerganov 2024-12-19 17:35:15 +0200
  • 20b6cc8f8e tts : small QoL for easy model fetch Georgi Gerganov 2024-12-19 17:12:49 +0200
  • 57bb2c40cd server : fix logprobs, make it OAI-compatible (#10783) Xuan Son Nguyen 2024-12-19 15:40:08 +0100
  • 5b966df177 fix setting prob for sampled token Xuan Son Nguyen 2024-12-19 14:39:36 +0100
  • 9b0e901829 llama : minor grammar refactor Georgi Gerganov 2024-12-19 15:38:56 +0200
  • a3c33b1dce ggml: fix arm build with gcc (#10895) b4363 Adrien Gallouët 2024-12-19 14:20:41 +0100
  • 2fffc52b50 llama : fix Roberta embeddings (#10856) b4362 Sukriti Sharma 2024-12-19 06:04:51 -0700
  • 63b7dd9e61 clip : disable GPU support Georgi Gerganov 2024-12-19 14:53:32 +0200
  • 489cb3eb6a ggml: fix arm build with gcc Adrien Gallouët 2024-12-19 12:25:45 +0100
  • a217382b25 correct comment placement Xuan Son Nguyen 2024-12-19 12:02:11 +0100
  • 65ef1c8dc9 rename struct token_prob to prob_info Xuan Son Nguyen 2024-12-19 11:51:50 +0100
  • d2463dc8df resolve review comments Xuan Son Nguyen 2024-12-19 11:46:34 +0100
  • fa522bc346 ASCII/Romanization Support for OuteTTS edwko 2024-12-19 12:39:17 +0200
  • 7a21a92ea3 Create anyascii.h edwko 2024-12-19 12:31:58 +0200
  • 2cfbccd356 ggml-cpu: format code Adrien Gallouët 2024-12-18 20:34:45 +0100
  • 82222b7489 ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0() Adrien Gallouët 2024-12-17 20:43:07 +0000
  • 7585edbdeb convert : Add support for Microsoft Phi-4 model (#10817) b4361 fairydreaming 2024-12-19 10:37:12 +0100
  • 24bad77ebf fix confict pcdack 2024-12-19 09:02:35 +0000
  • f91cf62b89 Merge branch 'master' into support_glm_edge_model piDack 2024-12-19 16:50:23 +0800
  • 9895f76a3f Merge c24778df01 into cd920d0ac3 Yann Follet 2024-12-19 10:44:12 +0200
  • cd920d0ac3 tests: disable GGUF test for bad value size (#10886) b4360 Johannes Gäßler 2024-12-19 08:53:58 +0100
  • a02c63d710 Support InfiniAI Megrez 3b dixyes 2024-12-19 11:48:22 +0800
  • 63c27eb10b more row choices Eve 2024-12-18 22:23:22 -0500
  • 6be041ae10 SYCL: Refactor SYCL buffer checks in ggml_sycl_cpy_tensor_2d Akarshan Biswas 2024-12-19 08:34:53 +0530
  • 7bbd9cb297 better row selection Eve 2024-12-18 22:03:53 -0500
  • 7909e8588d llama-run : improve progress bar (#10821) b4359 Eric Curtin 2024-12-19 02:58:00 +0000
  • ecad966562 conflict resolution Yee Man Chan 2024-12-19 10:41:32 +0800
  • 62dc17022b merge master Eve 2024-12-18 21:38:13 -0500
  • 0b2f031387 fix linting Sukriti-Sharma4 2024-12-18 18:49:51 -0700
  • be50c0c3b6 Merge branch 'master' into RobertaTokenizer Sukriti-Sharma4 2024-12-18 18:47:16 -0700
  • 334ddfd97d map roberta-bpe to gpt-2 Sukriti-Sharma4 2024-12-18 18:37:00 -0700
  • 9177484f58 ggml : fix arm build (#10890) b4358 Diego Devesa 2024-12-18 23:21:42 +0100
  • d4b125911c march -> mcpu, skip adding feature macros slaren 2024-12-18 20:50:04 +0100
  • 94bb27257d Merge 17db6beda4 into 0bf2d10c55 Robert 2024-12-18 20:32:35 +0100
  • 0a4b79ca3d disable llamafile in android example slaren 2024-12-18 20:24:25 +0100
  • 6f6794f23c remove msvc support, add GGML_CPU_ARM_ARCH option slaren 2024-12-18 19:38:51 +0100
  • 0bf2d10c55 tts : add OuteTTS support (#10784) b4357 Georgi Gerganov 2024-12-18 19:27:21 +0200
  • c0df192838 common : support HF download for vocoder Georgi Gerganov 2024-12-18 19:22:56 +0200
  • 3f90d2abeb ggml-cpu: re-add AArch64 NEON assembly for ggml_gemv_q4_0_4x4_q8_0() for non-dotprod Sam Purkis 2024-12-18 15:42:55 +0000
  • fd4cf34b00 "top_probs" with "post_sampling_probs" Xuan Son Nguyen 2024-12-18 17:27:29 +0100
  • 8734df73d9 remove --multi-token-probs Xuan Son Nguyen 2024-12-18 17:15:15 +0100
  • a95191c468 tts : minor fixes Georgi Gerganov 2024-12-18 17:50:57 +0200
  • dfc28d6194 Merge 1ee6c482d0 into 7bbb5acf12 compilade 2024-12-18 15:00:20 +0100
  • 7bbb5acf12 server: avoid overwriting Authorization header (#10878) Gaetan Bisson 2024-12-18 04:00:07 -1000
  • 75fe7751e5 update docs [no ci] Xuan Son Nguyen 2024-12-18 14:32:32 +0100
  • d7de64bc2b Merge branch 'master' into add_sched_dot_dump Judd 2024-12-18 21:14:21 +0800
  • ecadd37c63 add post_sampling_probs option Xuan Son Nguyen 2024-12-18 14:11:04 +0100
  • 29df666d44 tts : enable "return_tokens" in Python example Georgi Gerganov 2024-12-18 14:13:09 +0200
  • 2a1a6f6326 server : fix rebase artifacts Georgi Gerganov 2024-12-18 14:10:59 +0200
  • edb7896b49 tts : extend python example to generate spectrogram Georgi Gerganov 2024-12-17 14:21:06 +0200
  • 5038abe1ee tts : add Python example for OuteTTS (wip) Georgi Gerganov 2024-12-17 10:27:52 +0200
  • d291c74253 llama : handle no-vocab detokenization Georgi Gerganov 2024-12-16 21:45:25 +0200
  • 824fa750d4 llama : update WavTokenizer to non-causal attn Georgi Gerganov 2024-12-17 10:25:17 +0200
  • 2033fb7eef cont [no ci] Georgi Gerganov 2024-12-16 20:39:46 +0200
  • 35259e5335 cont Georgi Gerganov 2024-12-16 19:33:35 +0200
  • 980d631032 llama : refactor wavtokenizer tensors Georgi Gerganov 2024-12-16 19:21:50 +0200
  • d1ef627c51 tts : fix tensor shapes Georgi Gerganov 2024-12-16 16:48:22 +0200
  • c096bbd8dd tts : remove hardcoded constants Georgi Gerganov 2024-12-16 15:07:38 +0200
  • e70f140c04 tts : outetts-voc -> wavtokenizer-dec Georgi Gerganov 2024-12-16 13:51:09 +0200
  • befdcd2492 tts : text pre-processing Georgi Gerganov 2024-12-16 13:37:36 +0200
  • 3d54be4d84 tts : update default samplers Georgi Gerganov 2024-12-16 13:21:36 +0200
  • 1d7c27ca93 tts : fixes Georgi Gerganov 2024-12-11 21:42:53 +0200
  • 906a0edb5a tts : fix sampling + cut initial noise Georgi Gerganov 2024-12-11 21:33:51 +0200
  • 2221e54278 tts : add matchematical constant Georgi Gerganov 2024-12-11 20:31:20 +0200
  • d4fa34bdd4 tts : add header + minor fixes Georgi Gerganov 2024-12-11 19:00:03 +0200
  • 8329e850cc tts : minor fix Georgi Gerganov 2024-12-11 18:43:02 +0200
  • db613915de clip : fix new conv name Georgi Gerganov 2024-12-11 18:20:17 +0200
  • b9a011e123 tts : receive input text and generate codes Georgi Gerganov 2024-12-11 18:12:59 +0200
  • 191da330fc clean-up Georgi Gerganov 2024-12-11 16:50:40 +0200