1 year ago · ad52d5c259
--- a/README.md
+++ b/README.md
@@ -712,6 +712,9 @@ Building the program with BLAS support may lead to some performance improvements
 
 ### Prepare and Quantize
 
+> [!NOTE]
+> You can use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to quantise your model weights without any setup too. It is synced from `llama.cpp` main every 6 hours.
+
 To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
 
 Note: `convert.py` does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
--- a/examples/quantize/README.md
+++ b/examples/quantize/README.md
@@ -1,6 +1,8 @@
 # quantize
 
-TODO
+You can also use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to build your own quants without any setup.
+
+Note: It is synced from llama.cpp `main` every 6 hours.
 
 ## Llama 2 7B