# MiniCPM-o 2.6

Currently, this README covers only MiniCPM-o 2.6's image capabilities; full omni-mode support will be added as soon as possible.

## Prepare models and code

Download the [MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6) PyTorch model from Hugging Face into a `MiniCPM-o-2_6` folder.
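For example, one way to fetch the model is with the `huggingface-cli` tool (assuming `huggingface_hub` is installed; downloading manually from the web UI works just as well):

```bash
# download the full PyTorch checkpoint into ./MiniCPM-o-2_6
# (requires: pip install huggingface_hub)
huggingface-cli download openbmb/MiniCPM-o-2_6 --local-dir MiniCPM-o-2_6
```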

Clone the OpenBMB fork of llama.cpp and switch to the `minicpm-omni` branch:

```bash
git clone git@github.com:OpenBMB/llama.cpp.git
cd llama.cpp
git checkout minicpm-omni
```

## Usage of MiniCPM-o 2.6

Convert the PyTorch model to GGUF files (you can also download the GGUF files we have already converted):

```bash
# split the vision components out of the PyTorch checkpoint
python ./examples/llava/minicpmv-surgery.py -m ../MiniCPM-o-2_6

# convert the image encoder / projector to GGUF
python ./examples/llava/minicpmv-convert-image-encoder-to-gguf.py -m ../MiniCPM-o-2_6 --minicpmv-projector ../MiniCPM-o-2_6/minicpmv.projector --output-dir ../MiniCPM-o-2_6/ --image-mean 0.5 0.5 0.5 --image-std 0.5 0.5 0.5 --minicpmv_version 4

# convert the language model to GGUF
python ./convert_hf_to_gguf.py ../MiniCPM-o-2_6/model

# optionally quantize to int4 (Q4_K_M)
./llama-quantize ../MiniCPM-o-2_6/model/ggml-model-f16.gguf ../MiniCPM-o-2_6/model/ggml-model-Q4_K_M.gguf Q4_K_M
```
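After these steps, the converted files should sit under `../MiniCPM-o-2_6/`; a quick sanity check (the filenames below follow from the commands above):

```bash
ls ../MiniCPM-o-2_6/mmproj-model-f16.gguf        # vision projector
ls ../MiniCPM-o-2_6/model/ggml-model-f16.gguf    # f16 language model
ls ../MiniCPM-o-2_6/model/ggml-model-Q4_K_M.gguf # int4 quantized language model
```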

Build llama.cpp using CMake (see https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md for details):

```bash
cmake -B build
cmake --build build --config Release
```
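To build with GPU acceleration, you can pass the appropriate backend flag to the first `cmake` invocation; for example, for the CUDA backend (see the build docs linked above for other backends):

```bash
# enable the CUDA backend (requires the CUDA toolkit installed)
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```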

## Inference on Linux or Mac

```bash
# run the f16 version
./llama-minicpmv-cli -m ../MiniCPM-o-2_6/model/ggml-model-f16.gguf --mmproj ../MiniCPM-o-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"

# run the quantized int4 version
./llama-minicpmv-cli -m ../MiniCPM-o-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-o-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"

# or run in interactive mode
./llama-minicpmv-cli -m ../MiniCPM-o-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-o-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -i
```
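As a sketch, you could also caption a whole directory of images by looping in the shell (the `images/` directory here is hypothetical; substitute your own path):

```bash
# caption every .jpg in images/ with the int4 model
for img in images/*.jpg; do
  echo "== $img =="
  ./llama-minicpmv-cli -m ../MiniCPM-o-2_6/model/ggml-model-Q4_K_M.gguf \
    --mmproj ../MiniCPM-o-2_6/mmproj-model-f16.gguf \
    -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 \
    --image "$img" -p "What is in the image?"
done
```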