mirror of https://github.com/ggerganov/llama.cpp.git (synced 2024-12-26 03:14:35 +00:00)
readme: add missing info (#1324)
parent 2d13786e91
commit 921dcee00a
@@ -18,10 +18,12 @@ The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quant
 
 - Plain C/C++ implementation without dependencies
 - Apple silicon first-class citizen - optimized via ARM NEON and Accelerate framework
-- AVX2 support for x86 architectures
+- AVX, AVX2 and AVX512 support for x86 architectures
 - Mixed F16 / F32 precision
-- 4-bit integer quantization support
+- 4-bit, 5-bit and 8-bit integer quantization support
 - Runs on the CPU
+- OpenBLAS support
+- cuBLAS and CLBlast support
 
 The original implementation of `llama.cpp` was [hacked in an evening](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022).
 Since then, the project has improved significantly thanks to many contributions. This project is for educational purposes and serves