README-gemma3.md

# Gemma 3 vision

> [!IMPORTANT]
>
> This is very experimental, only intended for demo purposes.

## Quick start

You can use the pre-quantized models from the [ggml-org](https://huggingface.co/ggml-org) Hugging Face account:

```bash
# build
cmake -B build
cmake --build build --target llama-gemma3-cli

# alternatively, install from brew (macOS)
brew install llama.cpp

# run it
llama-gemma3-cli -hf ggml-org/gemma-3-4b-it-GGUF
llama-gemma3-cli -hf ggml-org/gemma-3-12b-it-GGUF
llama-gemma3-cli -hf ggml-org/gemma-3-27b-it-GGUF

# note: 1B model does not support vision
```

## How to get mmproj.gguf?

```bash
cd gemma-3-4b-it
python ../llama.cpp/examples/llava/gemma3_convert_encoder_to_gguf.py .

# output file is mmproj.gguf
```

## How to run it?

What you need:

- The text model GGUF, which can be converted using `convert_hf_to_gguf.py`
- The mmproj file from the step above
- An image file
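For the first item, a minimal conversion sketch — the directory layout (a llama.cpp checkout next to a downloaded `gemma-3-4b-it` model directory) and the `--outfile`/`--outtype` values are assumptions, not taken from this README:

```bash
# convert the downloaded Hugging Face text model to GGUF
# (paths and output name are illustrative)
python llama.cpp/convert_hf_to_gguf.py ./gemma-3-4b-it \
    --outfile gemma-3-4b-it-f16.gguf --outtype f16
```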

```bash
# build
cmake -B build
cmake --build build --target llama-gemma3-cli

# run it
./build/bin/llama-gemma3-cli -m {text_model}.gguf --mmproj mmproj.gguf --image your_image.jpg
```
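Assuming the CLI follows the usual llama.cpp argument conventions, a one-shot (non-interactive) run can pass the prompt on the command line; the `-p` flag combination and prompt text here are an illustrative sketch, not taken from this README:

```bash
# single-turn run: describe the image and exit
# (prompt text is an example; {text_model} is a placeholder as above)
./build/bin/llama-gemma3-cli -m {text_model}.gguf --mmproj mmproj.gguf \
    --image your_image.jpg -p "Describe this image in one sentence."
```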