Commit Graph

  • d7435fe320 fix whitespace, edit README.md jwj7140 2023-06-30 00:03:02 +0900
  • ad945e2c41 make instructions clearer Concedo 2023-06-29 22:13:39 +0800
  • 64aba0a151 update readme Concedo 2023-06-29 21:42:04 +0800
  • b8c8dda75f Use unsigned for random seed (#2006) master-b8c8dda Howard Su 2023-06-29 21:15:15 +0800
  • f09debb1ec remove debug Concedo 2023-06-29 20:54:56 +0800
  • 966d736582 revert cublasLt removal Concedo 2023-06-29 20:51:02 +0800
  • 10a2bdfaf1 Merge remote-tracking branch 'upstream/ik/context_extend' into concedo_experimental Concedo 2023-06-29 20:35:17 +0800
  • 749d6179a8 Snake case all functions niansa 2023-06-29 14:23:00 +0200
  • c7c6e522e7 bigger scratch buffers for bigger context Concedo 2023-06-29 19:43:23 +0800
  • 86b061b98c wip on unified cublas integration, add all the small libraries but exclude the large ones Concedo 2023-06-29 18:35:31 +0800
  • c2f1ed6556 fix compile errors Concedo 2023-06-29 17:54:12 +0800
  • dff5575647 Merge branch 'master' into concedo_experimental Concedo 2023-06-29 17:35:28 +0800
  • 5ac68ccacb Cleanups niansa 2023-06-29 11:14:21 +0200
  • 4b3a1282f0 Add flag for lowvram directly into cublas launch param Concedo 2023-06-29 17:07:31 +0800
  • 13c8d87111 breaking change: deprecate GGML_TASK_INIT and GGML_TASK_FINALIZE. Will not be scheduled unless explicitly enabled. mqy 2023-06-29 17:06:00 +0800
  • 746f5fa9e9 update lite Concedo 2023-06-29 16:44:39 +0800
  • f8baad235d use struct for grammar elements and add Unicode support Evan Jones 2023-06-20 00:06:38 -0400
  • 96a712ca1b Porting the improved K-Quant CUDA kernels to OpenCL (#1966) LostRuins 2023-06-29 11:56:43 +0800
  • d51a890546 Add CFLAGS when compiling ggml-cuda.o Ning Zhang 2023-06-28 20:28:32 -0700
  • fcb0a77b13 address feedbacks to avoid using exceptions randxie 2023-06-29 07:49:31 +0800
  • 7c6121eb64 use uint32_t for seed Howard Su 2023-06-28 15:52:23 -0700
  • d7d454f227 Use uint32_t for seed Howard Su 2023-06-28 15:51:56 -0700
  • e1db1c63bd Normalize graph of training Howard Su 2023-06-26 09:26:57 +0800
  • 04419f1894 Merge 'origin/master' into hipblas Henri Vasserman 2023-06-28 23:30:10 +0300
  • bb16effc75 headers fix; add kquants_iter for hipblas and add gfx803 (#1) YellowRoseCx 2023-06-28 15:27:10 -0500
  • 2d55023143 dequantize + matrix multiplication CUDA kernels JohannesGaessler 2023-06-26 10:27:38 +0200
  • 1d22550f87 Merge branch 'ggerganov:master' into master WangHaoranRobin 2023-06-28 13:08:25 -0700
  • 944059f7b6 Create codacy.yml m3ndax 2023-06-28 21:10:36 +0200
  • 11381f9ad7 Merge branch 'ggerganov:master' into master m3ndax 2023-06-28 20:39:58 +0200
  • d3494bb86b llama : replacing auto &kv with const auto &kv (#2041) master-d3494bb m3ndax 2023-06-28 20:39:08 +0200
  • b4a1acf51d Delete codacy.yml m3ndax 2023-06-28 20:26:33 +0200
  • 082d3b3628 Create codacy.yml m3ndax 2023-06-28 20:18:51 +0200
  • 436069cf9f Replacing auto &kv with const auto &kv mendax0110 2023-06-28 20:14:41 +0200
  • 5b351e94d0 cuda : remove nchannels_x argument from mul_mat_vec_nc_f16_f32 (#2028) master-5b351e9 Salvador E. Tropea 2023-06-28 14:27:31 -0300
  • 6432aabb6d cuda : fix missing const qualifier in casts (#2027) master-6432aab Salvador E. Tropea 2023-06-28 14:26:26 -0300
  • b922bc351b llama : remove shards weight file support (#2000) master-b922bc3 Howard Su 2023-06-28 10:13:02 -0700
  • a4149aa0c8 change token count method jwj7140 2023-06-29 02:04:37 +0900
  • 7f9753fa12 CUDA GPU acceleration for LoRAs + f16 models (#1970) master-7f9753f Johannes Gäßler 2023-06-28 18:35:54 +0200
  • cfa0750bc9 llama : support input embeddings directly (#1910) ningshanwutuobang 2023-06-28 23:53:37 +0800
  • b084f4dc46 option for cublas Concedo 2023-06-28 21:16:40 +0800
  • de7d1823ed Implemented ggml_vk_soft_max niansa 2023-06-28 12:48:41 +0200
  • a952716d35 try to fix compile warnings on macOS, address issue #2036 mqy 2023-06-28 18:29:02 +0800
  • b4698abafc Wip, CUDA porting malloc improvements, gpu accel for non-llama, backport old quants Concedo 2023-06-28 18:20:46 +0800
  • 3be25a2a40 Merge 76e5e2719d into 9d23589d63 Qingyou Meng 2023-06-28 03:00:47 -0600
  • 76e5e2719d corner case: when nth is 1, n_multiplier should be 1 mqy 2023-06-28 17:00:34 +0800
  • e2b721db65 Allow vk add row niansa 2023-06-28 10:19:18 +0200
  • ed14f0764a Fixed ggml_vk_abmath row argument niansa 2023-06-28 10:15:23 +0200
  • 69ccfce14a Merge 2257f9f691 into 9d23589d63 Howard Su 2023-06-27 22:45:59 -0400
  • c2c6790f9b Avoid unused constant warnings Salvador E. Tropea 2023-06-27 22:03:00 -0300
  • c9d9681a07 Removed nchannels_x argument from mul_mat_vec_nc_f16_f32 Salvador E. Tropea 2023-06-27 21:58:36 -0300
  • 68879d7557 Fixed missing const qualifier in casts Salvador E. Tropea 2023-06-27 21:54:25 -0300
  • a40a69af3b Merge afee3cfc1f into 9d23589d63 Hashem Alsaket 2023-06-28 01:39:55 +0200
  • 4afb12fbb3 stdint.h mqy 2023-06-28 06:20:59 +0800
  • e300a91104 stdbool.h mqy 2023-06-28 06:09:19 +0800
  • a1306ce6c6 prevent deadlock; cleanup mqy 2023-06-28 05:59:25 +0800
  • 767d1db097 remove thread local variable: Windows does not recognize it mqy 2023-06-28 04:51:55 +0800
  • fef9eac856 fix windows build error caused by mis-replacing text mqy 2023-06-28 04:32:44 +0800
  • 6b6770a52d Merge branch 'ggerganov:master' into master m3ndax 2023-06-27 22:30:54 +0200
  • 6e7f15ddf8 removed the k-quants changes mendax0110 2023-06-27 22:28:07 +0200
  • b1d402d5fb work stealing chunked task allocator example for issue #291 mqy 2023-06-28 03:26:39 +0800
  • e1abf636a4 fix mistakes jwj7140 2023-06-28 02:18:03 +0900
  • cc5de81208 fix bugs, remove chat format using \n jwj7140 2023-06-28 02:16:00 +0900
  • 9d23589d63 fix pthreads setaffinity usage on android (#2020) master-9d23589 Erik Scholz 2023-06-27 19:06:33 +0200
  • 67be3fc743 make llama_load_session_file_internal static randxie 2023-06-28 00:40:29 +0800
  • 7283f29bba Remove useless check Howard Su 2023-06-27 09:19:02 -0700
  • a49299bb98 Remove alignment_prevents_mmap which is no longer needed. Howard Su 2023-06-27 08:37:18 -0700
  • 73bcc5b144 Remove vocab_only from constructor of llama_model_loader Howard Su 2023-06-27 09:20:10 +0800
  • 333c40b94c Fixed typo Iwan Kawrakow 2023-06-27 19:04:00 +0300
  • ced65a56b0 convert checks in llama_load_session_file to throw and handle them randxie 2023-06-27 22:28:31 +0800
  • 9ce20a1170 fix pthreads setaffinity usage on android Green Sky 2023-06-27 15:54:51 +0200
  • 7abb513e94 fix ci error ningshanwutuobang 2023-06-27 20:38:52 +0800
  • d99d39981c Merge remote-tracking branch 'origin/master' into embd_inp ningshanwutuobang 2023-06-27 20:33:08 +0800
  • cda30038e4 Modified RoPE with linear scaling Iwan Kawrakow 2023-06-27 15:00:22 +0300
  • 5a16205274 Missing one place to replace -1 with default seed constant Howard Su 2023-06-27 19:48:07 +0800
  • 9527a783ea fix rope inplace Concedo 2023-06-27 19:44:33 +0800
  • 282376c85a Merge branch 'master' into concedo_experimental Concedo 2023-06-27 19:15:27 +0800
  • c8ae94524a Merge 'origin/master' into hipblas Henri Vasserman 2023-06-27 10:50:37 +0300
  • d94d0ae367 Remove dead code guess_n_parts function Howard Su 2023-06-26 19:59:49 +0800
  • 712c127773 Simplify load logic Howard Su 2023-06-26 19:57:04 +0800
  • 76752668de Remove llama_load_tensor_shard class Howard Su 2023-06-26 19:24:51 +0800
  • e4bb976c25 Remove multiple file loaders Howard Su 2023-06-26 16:16:45 +0800
  • d8147f236d Remove multiple shards Howard Su 2023-06-26 16:08:02 +0800
  • 74fe5fc1ea Change according to the review Howard Su 2023-06-27 08:17:35 +0800
  • e67b15f5c2 Use unsigned for random seed Howard Su 2023-06-26 22:58:22 +0800
  • 0be54f75a6 baby-llama : fix build after ggml_rope change (#2016) master-0be54f7 Howard Su 2023-06-27 13:07:13 +0800
  • 58828c209a Merge pull request #8 from WangHaoranRobin/robin_fork_master WangHaoranRobin 2023-06-26 18:11:48 -0700
  • bc88fece87 server: fix llama_sample_top_k order Wang Haoran(Robin) 2023-06-26 18:11:27 -0700
  • c7f7f13650 Merge branch 'ggerganov:master' into master WangHaoranRobin 2023-06-26 18:08:40 -0700
  • eabcfd24b1 Fix build in baby-llama after ggml_rope change Howard Su 2023-06-27 08:23:11 +0800
  • 8afa800fb6 Expose low_vram for CUDA YellowRoseCx 2023-06-26 16:47:22 -0500
  • 181e8d9755 llama : fix rope usage after ChatGLM change Georgi Gerganov 2023-06-27 00:37:13 +0300
  • d9779021bd ggml : add support for ChatGLM RoPE Georgi Gerganov 2023-06-27 00:06:51 +0300
  • d8f3f7089f Create codacy.yml m3ndax 2023-06-26 22:15:20 +0200
  • 39011ad7c4 change the order of the args of llama_eval_internal ningshanwutuobang 2023-06-27 04:06:20 +0800
  • 565afba411 use const auto &kv instead of auto &kv, statements into braces mendax0110 2023-06-26 21:59:08 +0200
  • d38e451578 readme : add Scala 3 bindings repo (#2010) Roman Parykin 2023-06-26 22:47:59 +0300
  • eaa6ca5a61 ggml : increase max tensor name + clean up compiler warnings in train-text (#1988) master-eaa6ca5 David Yang 2023-06-27 03:45:32 +0800
  • aa777abbb7 readme : LD_LIBRARY_PATH complement for some Android devices when building with CLBlast inside Termux (#2007) Gustavo Rocha Dias 2023-06-26 16:34:45 -0300
  • 5cc672a9a5 metal : try to utilize more of the shared memory using smaller views try-fix-metal Georgi Gerganov 2023-06-26 22:23:04 +0300
  • 19e45694c3 Add Scala 3 bindings donderom 2023-06-26 22:00:33 +0300