
quantize

TODO

Llama 2 7B

Quantization | Bits per Weight (BPW)
-- | --
Q2_K | 3.35
Q3_K_S | 3.50
Q3_K_M | 3.91
Q3_K_L | 4.27
Q4_K_S | 4.58
Q4_K_M | 4.84
Q5_K_S | 5.52
Q5_K_M | 5.68
Q6_K | 6.56

Llama 2 13B

Quantization | Bits per Weight (BPW)
-- | --
Q2_K | 3.34
Q3_K_S | 3.48
Q3_K_M | 3.89
Q3_K_L | 4.26
Q4_K_S | 4.56
Q4_K_M | 4.83
Q5_K_S | 5.51
Q5_K_M | 5.67
Q6_K | 6.56

Llama 2 70B

Quantization | Bits per Weight (BPW)
-- | --
Q2_K | 3.40
Q3_K_S | 3.47
Q3_K_M | 3.85
Q3_K_L | 4.19
Q4_K_S | 4.53
Q4_K_M | 4.80
Q5_K_S | 5.50
Q5_K_M | 5.65
Q6_K | 6.56
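
The BPW figures can be turned into a rough estimate of weight size, since size in bytes is approximately parameter count × BPW / 8. The sketch below is illustrative only: the helper names are hypothetical, the parameter count is the nominal 7 billion rather than an exact figure, and the result ignores file metadata and any other overhead.

```python
# Rough size estimate from bits per weight (BPW).
# Assumption: size_bytes ~= n_params * bpw / 8; metadata and overhead are ignored.

GIB = 2**30  # bytes per GiB


def estimated_size_gib(n_params: float, bpw: float) -> float:
    """Approximate quantized weight size in GiB."""
    return n_params * bpw / 8 / GIB


# BPW values copied from the Llama 2 7B table above.
LLAMA2_7B_BPW = {
    "Q2_K": 3.35,
    "Q3_K_S": 3.50,
    "Q3_K_M": 3.91,
    "Q3_K_L": 4.27,
    "Q4_K_S": 4.58,
    "Q4_K_M": 4.84,
    "Q5_K_S": 5.52,
    "Q5_K_M": 5.68,
    "Q6_K": 6.56,
}

# Assumption: nominal parameter count, not the exact number of weights.
NOMINAL_PARAMS_7B = 7e9


def size_table(n_params: float, bpw_by_type: dict) -> dict:
    """Map each quantization type to its estimated size in GiB (2 decimals)."""
    return {
        name: round(estimated_size_gib(n_params, bpw), 2)
        for name, bpw in bpw_by_type.items()
    }


if __name__ == "__main__":
    for name, gib in size_table(NOMINAL_PARAMS_7B, LLAMA2_7B_BPW).items():
        print(f"{name:8s} ~{gib:.2f} GiB")
```

The same function applies to the 13B and 70B tables by swapping in their BPW values and a matching parameter count.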