cturan/llama.cpp

Author	SHA1 Message	Date
Kawrakow	bd2d4e393b 1.5 bit quantization (#5453)	1 year ago
snadampal	a07d0fee1f ggml : add mmla kernels for quantized GEMM (#4966)	2 years ago
Kawrakow	c6b395535a ggml : make use of ggml-quants.h possible in C++ code (#5338)	2 years ago
Kawrakow	f4d7e54974 SOTA 3-bit quants (#5196)	2 years ago
Georgi Gerganov	38566680cd ggml : add IQ2 to test-backend-ops + refactoring (#4990)	2 years ago
Kawrakow	334a835a1c ggml : importance matrix support for legacy quants (#4969)	2 years ago
Kawrakow	467a882fd2 Add ability to use importance matrix for all k-quants (#4930)	2 years ago
Kawrakow	147b17ac94 2-bit quantizations (#4897)	2 years ago
Kawrakow	49662cbed3 ggml : SOTA 2-bit quants (add IQ2_XS) (#4856)	2 years ago
Kawrakow	dd5ae06405 SOTA 2-bit quants (#4773)	2 years ago
Georgi Gerganov	d061bf9405 ggml : fix q2_k bpw in comments (ggml/680)	2 years ago
Georgi Gerganov	207b51900e ggml : move FP16 <-> FP32 code to ggml-impl.h (#3861)	2 years ago
Georgi Gerganov	d69d777c02 ggml : quantization refactoring (#3833)	2 years ago