mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-11-15 15:29:53 +00:00
b72c20b85c
* add truncate_bf16 * truncate intermediate fp32 if converting bf16 to bf16 * fix masking in __compute_fp32_to_bf16 * np.int16 no longer used * missing cast and additional numpy 2.x fix * ggml-impl : do not flush bf16 subnormals to zero * ggml : add reference fp32 to bf16 conversion The fast version is no longer equivalent for all platforms because of the handling of subnormal values. * gguf-py : remove flush to zero for bf16 subnormals * gguf-py : remove float32 truncation to bf16 Rounding achieves the same thing in the cases where this was used. * missed prototype update in merge * merge cleanup --------- Co-authored-by: Francis Couture-Harpin <git@compilade.net> |
||
---|---|---|
.. | ||
ggml-alloc.h | ||
ggml-backend.h | ||
ggml-blas.h | ||
ggml-cann.h | ||
ggml-cuda.h | ||
ggml-kompute.h | ||
ggml-metal.h | ||
ggml-rpc.h | ||
ggml-sycl.h | ||
ggml-vulkan.h | ||
ggml.h |