Aaron Miller
f0cd38b9ad
add mat*mat ops
2023-11-03 17:22:22 -04:00
Adam Treat
09d83f0401
Delete TODO now that we have q8_0.
2023-11-03 17:22:22 -04:00
Aaron Miller
8564f79036
falcon h2d + reenable vulkan
2023-11-03 17:22:22 -04:00
Aaron Miller
020b1745a0
vulkan: implement neox mode for rope
2023-11-03 17:22:21 -04:00
Aaron Miller
ff4212d20f
q8 mat*vec
2023-11-03 17:22:21 -04:00
Aaron Miller
9db90cbe12
f16 mv broadcasting fix (gqa fix)
2023-11-03 17:22:21 -04:00
Cebtenzzre
3d850db767
kompute : remove Q6_K from list of supported quant types
2023-11-03 17:22:21 -04:00
Cebtenzzre
24a4a5956a
kompute : only try to use Vulkan for LLaMA itself
2023-11-03 17:22:21 -04:00
Adam Treat
bc4b5ed1cb
Fixes for subgroup size to bring AMD and NVIDIA in line with each other for all kernels.
2023-11-03 17:22:21 -04:00
Adam Treat
de589ced7c
Change this back to be in agreement with metal and our previous softmax kernel.
2023-11-03 17:22:21 -04:00
Adam Treat
6ac39752bf
Fix up the upstream CMakeLists.txt so we can build just llama.cpp with our branch.
2023-11-03 17:22:21 -04:00
Adam Treat
32289aa447
Fixes for norm.
2023-11-03 17:22:21 -04:00
Adam Treat
06d4b21598
Fix offset into the qh; now we have working Vulkan acceleration for GGUF'd llama.
2023-11-03 17:22:21 -04:00
Adam Treat
f1c9bc1821
Add q6_k getrows and mul*vec kernel.
2023-11-03 17:22:21 -04:00
Adam Treat
4b223ec432
Refactor getrows to use common code and get ready for q6_k.
2023-11-03 17:22:21 -04:00
Adam Treat
5509f74318
Minor cleanup.
2023-11-03 17:22:21 -04:00
Adam Treat
601905e75e
Move the subgroups and printf into common.
2023-11-03 17:22:21 -04:00
Adam Treat
93306f16d0
Consolidate code for mat x vec kernels and use subgroups more extensively.
2023-11-03 17:22:21 -04:00
Adam Treat
77135a3bf5
Add common boilerplate code via include and eliminate copy-pasta
2023-11-03 17:22:21 -04:00
Adam Treat
9e4f8b4acc
Upload immediately to device.
2023-11-03 17:22:21 -04:00
Cebtenzzre
6b6c73a9e3
kompute : don't fail build because of -Warray-bounds
There are some warnings in debug builds that are likely to be false
positives.
2023-11-03 17:22:21 -04:00
Adam Treat
1b1416d7b7
Support for gguf.
2023-11-03 17:22:20 -04:00
Adam Treat
2c24d67e7b
Don't crash on available devices if we can't even create an instance.
2023-10-05 13:39:18 -04:00
Adam Treat
addac25293
Set the singleton to nullptr here.
2023-10-05 13:39:18 -04:00
Adam Treat
68aca6be08
Only use Vulkan with known quants that work.
2023-10-05 13:39:18 -04:00
Adam Treat
4ed25b2f88
Sync from device back to host at the beginning of a new prompt.
2023-10-05 13:39:18 -04:00
Adam Treat
bd5f6399bb
Don't try to install kompute artifacts.
2023-10-05 13:39:18 -04:00
Aaron Miller
8bea719879
vulkan: disambiguate gpus with the same name
2023-10-05 13:39:18 -04:00
Adam Treat
68cf1df6fb
Throw an exception when allocation fails for vulkan.
2023-10-05 13:39:18 -04:00
Aaron Miller
beee57266f
Make kompute actually include external SDK headers when requested
2023-10-05 13:39:18 -04:00
Adam Treat
b7e2e691d4
Completely revamp how we do object management with the vulkan backend and stop using so many static objects so we can tear down and bring up vulkan on new devices in the same runtime.
2023-10-05 13:39:18 -04:00
Adam Treat
45c8778b49
Switch to a dynamic dispatch table instead of linking hard against libvulkan.
2023-10-05 13:39:18 -04:00
Aaron Miller
8563fa001f
remove dynamic deps from kompute build
should no longer have new external deps other than libvulkan
```
ubuntu@ip-172-31-1-24:~/repo/gpt4all/gpt4all-backend/build$ ldd ./libllamamodel-mainline-avxonly.so
linux-vdso.so.1 (0x00007ffcb53bb000)
libvulkan.so.1 => /lib/x86_64-linux-gnu/libvulkan.so.1 (0x00007f239dab5000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f239d800000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f239d719000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f239da95000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f239d400000)
/lib64/ld-linux-x86-64.so.2 (0x00007f239dd1d000)
```
2023-10-05 13:39:18 -04:00
Adam Treat
48a45ea435
Remove warning which fails on windows.
2023-10-05 13:39:18 -04:00
niansa
ba15dfd0be
Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.
2023-10-05 13:39:18 -04:00
Kevin Ji
45855b3f1c
docs : mark code as Bash ( #3375 )
2023-09-28 09:11:32 -04:00
Pierre Alexandre SCHEMBRI
4aea3b846e
readme : add Mistral AI release 0.1 ( #3362 )
2023-09-28 15:13:37 +03:00
slaren
da0400344b
ggml-cuda : perform cublas fp16 matrix multiplication as fp16 ( #3370 )
* ggml-cuda : perform cublas fp16 matrix multiplication as fp16
* try to fix rocm build
* restrict fp16 mat mul to volta and up
2023-09-28 13:08:28 +03:00
Zhang Peiyuan
e519621010
convert : remove bug in convert.py permute function ( #3364 )
2023-09-27 20:45:20 +02:00
Richard Roberson
ac43576124
make-ggml.py : compatibility with more models and GGUF ( #3290 )
* Resync my fork with new llama.cpp commits
* examples : rename to use dash instead of underscore
* New model conversions
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-27 19:25:12 +03:00
Cebtenzzre
20c7e1e804
gguf : fix a few general keys ( #3341 )
2023-09-27 12:18:07 -04:00
Rickard Hallerbäck
dc6897404e
metal : reusing llama.cpp logging ( #3152 )
* metal : reusing llama.cpp logging
* cmake : build fix
* metal : logging callback
* metal : logging va_args memory fix
* metal : minor cleanup
* metal : setting function like logging macro to capital letters
* llama.cpp : trailing whitespace fix
* ggml : log level enum used by llama
* Makefile : cleanup ggml-metal recipe
* ggml : ggml_log_callback typedef
* ggml : minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-27 18:48:33 +03:00
Jag Chadha
527e57cfd8
build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma ( #3342 )
2023-09-27 18:34:32 +03:00
BarfingLemurs
ffe88a36a9
readme : add some recent perplexity and bpw measurements to READMES, link for k-quants ( #3340 )
* Update README.md
* Update README.md
* Update README.md with k-quants bpw measurements
2023-09-27 18:30:36 +03:00
DAN™
99115f3fa6
cmake : fix build-info.h on MSVC ( #3309 )
2023-09-25 18:45:33 -04:00
2f38b454
1726f9626f
docs: Fix typo CLBlast_DIR var. ( #3330 )
2023-09-25 20:24:52 +02:00
Erik Scholz
a98b1633d5
nix : add cuda, use a symlinked toolkit for cmake ( #3202 )
2023-09-25 13:48:30 +02:00
slaren
c091cdfb24
llama-bench : add README ( #3317 )
* llama-bench : add README
* minor edit
2023-09-23 21:48:24 +02:00
Cebtenzzre
51a7cf5c6e
examples : fix RoPE defaults to match PR #3240 ( #3315 )
2023-09-23 12:28:50 +03:00
Kevin Ji
bedb92b603
scripts : use /usr/bin/env in shebang ( #3313 )
2023-09-22 23:52:23 -04:00