cturan/llama.cpp

mirror of https://github.com/cturan/llama.cpp

Author	SHA1 Message	Date
Alfred	ce734a8a2f ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations (#17977)	4 weeks ago
Max Krasnyansky	63d2fc46e1 Add experimental ggml-hexagon backend for the Hexagon NPU (#16547)	2 months ago