cturan/llama.cpp

Author	SHA1 Message	Date
Sigbjørn Skjæret	b25346221d llama : return mistral-v7-tekken as default template only (#14390)	6 months ago
Georgi Gerganov	e8215dbb96 metal : add special-case mat-vec mul for ne00 == 4 (#14385)	6 months ago
Georgi Gerganov	5783ae4359 metal : batch rows copy in a single threadgroup (#14384)	6 months ago
Aaron Teo	bf5bcd0b85 docs: update s390x documentation + add faq (#14389)	6 months ago
R0CKSTAR	716301d1b0 musa: enable fp16 mma (all) and cublas on qy2 (#13842)	6 months ago
Aaron Teo	60ef23d6c1 ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317)	6 months ago
Sigbjørn Skjæret	b193d53069 ggml : do not output unprintable characters on GGUF load failure (#14381)	6 months ago
Anton Mitkov	2bf9d539dd sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (#13973)	6 months ago
lhez	73e53dc834 opencl: ref count `ggml_backend_opencl_context` and refactor profiling (#14254)	7 months ago
Georgi Gerganov	62af464227 batch : fix check for empty sequences in memory (#14364)	7 months ago
Mathieu Baudier	c148cf1946 cmake : use LLAMA_BUILD_NUMBER when defining LLAMA_INSTALL_VERSION (#14362)	7 months ago
Nigel Bosch	1b809cee22 server : move no API key doc to /health (#14352)	7 months ago
Sigbjørn Skjæret	abf241045d main : honor --verbose-prompt on interactive prompts (#14350)	7 months ago
Bartowski	901e20bbe5 jinja : Add Mistral-Small-3.2-24B-Instruct-2506.jinja (#14349)	7 months ago
uvos	0142961a2e CUDA/HIP: optimize mmv paths taken for HIP devices (#14324)	7 months ago
bandoti	ce82bd0117 ci: add workflow for relocatable cmake package (#14346)	7 months ago
Jeff Bolz	bf2a99e3cb vulkan: update windows SDK in release.yml (#14344)	7 months ago
Molly Sophia	72c6bc3f3d llama : better rwkv chat template and add missing `inputs.use_jinja` setting (#14336)	7 months ago
Johannes Gäßler	defe2158dd CUDA: mul_mat_v support for batch sizes > 1 (#14262)	7 months ago
Georgi Gerganov	7b50d589a8 kv-cells : fix tracking of seq_pos (#14339)	7 months ago
Jeff Bolz	3a9457df96 vulkan: update windows SDK in CI (#14334)	7 months ago
Ed Addario	fa4a9f2a1c quantize : handle user-defined pruning of whole layers (blocks) (#13037)	7 months ago
Sigbjørn Skjæret	238005c2dc gguf-py : fix SpecialVocab parsing when post_processor is null (#14330)	7 months ago
Ruikai Peng	66aba7aca9 run : avoid double tokenization (#14327)	7 months ago
Georgi Gerganov	f1f5e82df6 examples : fix is_first logic for tokenization (#14329)	7 months ago
uvos	af3373f1ad HIP: enable vec fattn on RDNA4 (#14323)	7 months ago
yuiseki	5d5c066de8 mtmd : fix Pixtral OOM with large images by capping image_size to 1024 (#14326)	7 months ago
Sigbjørn Skjæret	40bfa04c95 common : use std::string_view now that we target c++17 (#14319)	7 months ago
Aman Gupta	aa064b2eb7 CUDA: add mean operation (#14313)	7 months ago
Sigbjørn Skjæret	aa0ef5c578 gguf-py : fix Qwen3-Embedding eos token (#14314)	7 months ago

