Kawrakow | cbc8343619 | Make IQ1_M work for QK_K = 64 (#6327) | 1 year ago
Kawrakow | 55c1b2a3bb | IQ1_M: 1.75 bpw quantization (#6302) | 1 year ago
Georgi Gerganov | 8030da7afe | ggml : reuse quantum structs across backends (#5943) | 1 year ago
Kawrakow | 44ca159faf | 1.5 bit: we can do even better (#5999) | 1 year ago
Kawrakow | be858f6205 | Better 1.5 bit quantization (#5971) | 1 year ago
Georgi Gerganov | bf47a5eefc | ggml : remove __constant__ specifier for CUDA tables (#5940) | 1 year ago
Georgi Gerganov | 8a3012a4ad | ggml : add ggml-common.h to deduplicate shared code (#5940) | 1 year ago