@@ -28,6 +28,30 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)
----
+## Quick start
+
+Getting started with `llama.cpp` is straightforward. Here are several ways to install it on your machine:
+
+- Install `llama.cpp` using [brew, nix or winget](docs/install.md) (see the Homebrew example below)
+- Run with Docker - see our [Docker documentation](docs/docker.md)
+- Download pre-built binaries from the [releases page](https://github.com/ggml-org/llama.cpp/releases)
+- Build from source by cloning this repository - check out [our build guide](docs/build.md)
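+
+For example, installing via Homebrew (the formula name `llama.cpp` is taken from the linked install docs):
+
+```sh
+# install the llama.cpp CLI and server binaries with Homebrew
+brew install llama.cpp
+
+# confirm the binaries are on your PATH
+llama-cli --version
+```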
+
+Once installed, you'll need a model to work with. Head to the [Obtaining and quantizing models](#obtaining-and-quantizing-models) section to learn more.
+
+Example commands:
+
+```sh
+# Use a local model file
+llama-cli -m my_model.gguf
+
+# Or download and run a model directly from Hugging Face
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
+
+# Launch OpenAI-compatible API server
+llama-server -hf ggml-org/gemma-3-1b-it-GGUF
+```
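+
+Once `llama-server` is up, you can query it with an OpenAI-style chat request from another terminal (this sketch assumes the default listen address of `http://127.0.0.1:8080`):
+
+```sh
+# minimal chat completion request against the local server
+curl http://127.0.0.1:8080/v1/chat/completions \
+    -H "Content-Type: application/json" \
+    -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
+```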
+
## Description
The main goal of `llama.cpp` is to enable LLM inference with minimal setup and state-of-the-art performance on a wide
@@ -230,6 +254,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
</details>
+
## Supported backends
| Backend | Target devices |
@@ -246,16 +271,6 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
| [OpenCL](docs/backend/OPENCL.md) | Adreno GPU |
| [RPC](https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc) | All |
-## Building the project
-
-The main product of this project is the `llama` library. Its C-style interface can be found in [include/llama.h](include/llama.h).
-The project also includes many example programs and tools using the `llama` library. The examples range from simple, minimal code snippets to sophisticated sub-projects such as an OpenAI-compatible HTTP server. Possible methods for obtaining the binaries:
-
-- Clone this repository and build locally, see [how to build](docs/build.md)
-- On MacOS or Linux, install `llama.cpp` via [brew, flox or nix](docs/install.md)
-- Use a Docker image, see [documentation for Docker](docs/docker.md)
-- Download pre-built binaries from [releases](https://github.com/ggml-org/llama.cpp/releases)
-
## Obtaining and quantizing models
The [Hugging Face](https://huggingface.co) platform hosts a [number of LLMs](https://huggingface.co/models?library=gguf&sort=trending) compatible with `llama.cpp`:
@@ -263,7 +278,11 @@ The [Hugging Face](https://huggingface.co) platform hosts a [number of LLMs](htt
- [Trending](https://huggingface.co/models?library=gguf&sort=trending)
- [LLaMA](https://huggingface.co/models?sort=trending&search=llama+gguf)
-You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`.
+You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`. For example:
+
+```sh
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
+```
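+
+The optional `:quant` suffix selects a specific quantization from the repository, assuming such a file is actually published there:
+
+```sh
+# request a specific quantization (the Q8_0 file is assumed to exist in this repo)
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF:Q8_0
+```
+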
By default, the CLI downloads from Hugging Face; you can switch to other endpoints with the environment variable `MODEL_ENDPOINT`. For example, you can download model checkpoints from ModelScope or other model-sharing communities by setting `MODEL_ENDPOINT=https://www.modelscope.cn/`.
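+
+For example (the placeholder model id is illustrative; the repository must exist on the chosen endpoint):
+
+```sh
+# download the GGUF from ModelScope instead of Hugging Face
+MODEL_ENDPOINT=https://www.modelscope.cn/ llama-cli -hf <user>/<model>[:quant]
+```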