Kawrakow 66d575c45c llama : add Q3_K_XS (#5060) 2 years ago
CMakeLists.txt b12fa0d1c1 build : link against build info instead of compiling against it (#3879) 2 years ago
README.md ffe88a36a9 readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340) 2 years ago
quantize.cpp 66d575c45c llama : add Q3_K_XS (#5060) 2 years ago

README.md

# quantize

TODO

## Llama 2 7B

Quantization | Bits per Weight (BPW)
-- | --
Q2_K | 3.35
Q3_K_S | 3.50
Q3_K_M | 3.91
Q3_K_L | 4.27
Q4_K_S | 4.58
Q4_K_M | 4.84
Q5_K_S | 5.52
Q5_K_M | 5.68
Q6_K | 6.56

## Llama 2 13B

Quantization | Bits per Weight (BPW)
-- | --
Q2_K | 3.34
Q3_K_S | 3.48
Q3_K_M | 3.89
Q3_K_L | 4.26
Q4_K_S | 4.56
Q4_K_M | 4.83
Q5_K_S | 5.51
Q5_K_M | 5.67
Q6_K | 6.56

## Llama 2 70B

Quantization | Bits per Weight (BPW)
-- | --
Q2_K | 3.40
Q3_K_S | 3.47
Q3_K_M | 3.85
Q3_K_L | 4.19
Q4_K_S | 4.53
Q4_K_M | 4.80
Q5_K_S | 5.50
Q5_K_M | 5.65
Q6_K | 6.56
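
The BPW figures above can be turned into a rough size estimate by multiplying the number of weights by the bits per weight and dividing by 8 to get bytes. The snippet below is a minimal sketch of that arithmetic, not part of the `quantize` tool: the function name `estimate_size_gib` is hypothetical, and the 7e9 parameter count is a nominal value assumed from the model name rather than a measured one. Real files also carry metadata, so treat the results as approximations.

```python
# Bits-per-weight values copied from the Llama 2 7B table above.
LLAMA2_7B_BPW = {
    "Q2_K": 3.35,
    "Q3_K_S": 3.50,
    "Q3_K_M": 3.91,
    "Q3_K_L": 4.27,
    "Q4_K_S": 4.58,
    "Q4_K_M": 4.84,
    "Q5_K_S": 5.52,
    "Q5_K_M": 5.68,
    "Q6_K": 6.56,
}

# Assumption: nominal parameter count implied by the name "7B", not a measured value.
NOMINAL_PARAMS_7B = 7e9


def estimate_size_gib(n_params: float, bpw: float) -> float:
    """Approximate weight storage in GiB: n_params * bpw bits, converted to bytes, then GiB."""
    if n_params < 0 or bpw < 0:
        raise ValueError("n_params and bpw must be non-negative")
    total_bytes = n_params * bpw / 8
    return total_bytes / 2**30


# Print an estimated size for each quantization type in the table.
for name, bpw in LLAMA2_7B_BPW.items():
    size = estimate_size_gib(NOMINAL_PARAMS_7B, bpw)
    print(f"{name:7s} {bpw:5.2f} bpw  ~{size:.2f} GiB")
```

Under these assumptions, Q4_K_M at 4.84 BPW comes out a little under 4 GiB for a nominal 7B-parameter model. The same function can be applied to the 13B and 70B tables by substituting their BPW values and an assumed parameter count.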