@@ -31,7 +31,7 @@ variety of hardware - locally and in the cloud.
 - Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
 - AVX, AVX2 and AVX512 support for x86 architectures
 - 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
-- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
+- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads MTT GPUs via MUSA)
 - Vulkan and SYCL backend support
 - CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity
@@ -413,7 +413,7 @@ Please refer to [Build llama.cpp locally](./docs/build.md)
 | [BLAS](./docs/build.md#blas-build) | All |
 | [BLIS](./docs/backend/BLIS.md) | All |
 | [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
-| [MUSA](./docs/build.md#musa) | Moore Threads GPU |
+| [MUSA](./docs/build.md#musa) | Moore Threads MTT GPU |
 | [CUDA](./docs/build.md#cuda) | Nvidia GPU |
 | [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
 | [Vulkan](./docs/build.md#vulkan) | GPU |
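As a usage sketch (not part of the diff itself): the MUSA backend this change documents is enabled at build time through a CMake option described in `docs/build.md#musa`; the flag name `GGML_MUSA` is assumed here from those docs, and building requires the Moore Threads MUSA SDK to be installed on the host.

```shell
# Configure llama.cpp with the MUSA backend for Moore Threads MTT GPUs
# (option name assumed per docs/build.md#musa; requires the MUSA SDK)
cmake -B build -DGGML_MUSA=ON

# Build the project in Release mode
cmake --build build --config Release
```

The other backends in the table follow the same pattern, each with its own CMake switch documented in `docs/build.md`.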