compilade | e54d41befc | gguf-py : add Numpy MXFP4 de/quantization support (#15111) | 5 months ago
compilade | 9bc6db28d0 | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151) | 1 year ago
compilade | 4134999e01 | gguf-py : Numpy dequantization for most types (#8939) | 1 year ago
compilade | 3a14e00366 | gguf-py : simplify support for quant types (#8838) | 1 year ago
Sigbjørn Skjæret | b72c20b85c | Fix conversion of unnormalized BF16->BF16 weights (#7843) | 1 year ago
Xuan Son Nguyen | 97bdd26eee | Refactor lora adapter support (#8332) | 1 year ago
compilade | b83bab15a5 | gguf-py : fix and simplify quantized shape round-trip (#7483) | 1 year ago
compilade | ee52225067 | convert-hf : support direct Q8_0 conversion (#7234) | 1 year ago