You can also use the GGUF-my-repo space on Hugging Face to build your own quants without any setup.
Note: It is synced from llama.cpp main every 6 hours.
| Quantization | Bits per Weight (BPW) |
|---|---|
| Q2_K | 3.35 |
| Q3_K_S | 3.50 |
| Q3_K_M | 3.91 |
| Q3_K_L | 4.27 |
| Q4_K_S | 4.58 |
| Q4_K_M | 4.84 |
| Q5_K_S | 5.52 |
| Q5_K_M | 5.68 |
| Q6_K | 6.56 |
| Quantization | Bits per Weight (BPW) |
|---|---|
| Q2_K | 3.34 |
| Q3_K_S | 3.48 |
| Q3_K_M | 3.89 |
| Q3_K_L | 4.26 |
| Q4_K_S | 4.56 |
| Q4_K_M | 4.83 |
| Q5_K_S | 5.51 |
| Q5_K_M | 5.67 |
| Q6_K | 6.56 |
| Quantization | Bits per Weight (BPW) |
|---|---|
| Q2_K | 3.40 |
| Q3_K_S | 3.47 |
| Q3_K_M | 3.85 |
| Q3_K_L | 4.19 |
| Q4_K_S | 4.53 |
| Q4_K_M | 4.80 |
| Q5_K_S | 5.50 |
| Q5_K_M | 5.65 |
| Q6_K | 6.56 |
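The BPW figures above translate directly into approximate on-disk model size: multiply the parameter count by the bits per weight and divide by 8 to get bytes. The following is a minimal sketch of that arithmetic using the first table's values and an assumed 7-billion-parameter model; real GGUF files also carry metadata and keep a few tensors at higher precision, so actual sizes differ slightly.

```python
# Rough file-size estimate from the BPW tables above.
# The parameter count below is an assumption for illustration only;
# real GGUF files also include metadata and some higher-precision tensors.

GIB = 1024 ** 3

bpw = {
    "Q2_K": 3.35,
    "Q3_K_S": 3.50,
    "Q3_K_M": 3.91,
    "Q3_K_L": 4.27,
    "Q4_K_S": 4.58,
    "Q4_K_M": 4.84,
    "Q5_K_S": 5.52,
    "Q5_K_M": 5.68,
    "Q6_K": 6.56,
}

def estimated_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size: parameters * bits per weight / 8 bytes."""
    return n_params * bits_per_weight / 8 / GIB

n_params = 7e9  # assumed parameter count for the example
for quant, bits in bpw.items():
    print(f"{quant}: ~{estimated_size_gib(n_params, bits):.1f} GiB")
```

For example, Q4_K_M at 4.84 BPW on the assumed 7B-parameter model works out to roughly 4 GiB, versus about 13 GiB for the same weights stored at 16 bits.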