cturan/llama.cpp

Author	SHA1 Message	Date
Atharva Dubey	14492144c2 enable dpcpp nightly builds with libraries (#13406)	8 months ago
City	c104023994 mtmd : Use RMS norm for InternVL 3 38B and 78B mmproj (#13459)	8 months ago
Anthony Umfer	9a390c4829 tools : fix uninitialized llama_batch in server (#13436)	8 months ago
Sigbjørn Skjæret	09232370fc scripts : exit compare-llama-bench.py gracefully when there's nothing to compare (#13451)	8 months ago
Johannes Gäßler	7474e00b34 CUDA: fix crash with partial offloading of MoE (#13439)	8 months ago
David Huang	7f323a589f Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)	8 months ago
City	3eac209319 mtmd : support InternVL 3 38B and 78B mmproj (#13443)	8 months ago
Xuan-Son Nguyen	a634d75d1b mtmd : move helpers to dedicated file (#13442)	8 months ago
Thomas Germer	62d4250e52 docs : Fix typo in InternVL3 model name (#13440)	8 months ago
Johannes Gäßler	0208355f42 CUDA: fix race conditions FlashAttention kernels (#13438)	8 months ago
Sigbjørn Skjæret	d2a4ef05c6 vocab : add ByteDance-Seed/Seed-Coder (#13423)	8 months ago
Xuan-Son Nguyen	15e6125a39 mtmd : add hard limit on image resolution for qwen2vl / qwen2.5vl (#13434)	8 months ago
Xuan-Son Nguyen	3b24d26c22 server : update docs (#13432)	8 months ago
Sigbjørn Skjæret	43dfd741a5 llguidance : set tokenizer slices to default (#13424)	8 months ago
Thammachart Chinvarapon	b064a51a4e ci: free_disk_space flag enabled for intel variant (#13426)	8 months ago
Xuan-Son Nguyen	053367d149 mtmd : support InternVL 2.5 and 3 (#13422)	8 months ago
Johannes Gäßler	d8919424f1 CUDA: fix FlashAttention on Turing (#13415)	8 months ago
Xuan-Son Nguyen	7fef11766c arg : add env var to control mmproj (#13416)	8 months ago
Jeff Bolz	dc1d2adfc0 vulkan: scalar flash attention implementation (#13324)	8 months ago
Helton Reis	7c28a74e07 chore(llguidance): use tagged version that does not break the build (#13413)	8 months ago
Xuan-Son Nguyen	33eff40240 server : vision support via libmtmd (#12898)	8 months ago
Alberto Cabrera Pérez	17512a94d6 sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (#12858)	8 months ago
Georgi Gerganov	611aa914ef metal : optimize MoE for large batches (#13388)	8 months ago
Johannes Gäßler	0cf6725e9f CUDA: FA support for Deepseek (Ampere or newer) (#13306)	8 months ago
Diego Devesa	27ebfcacba llama : do not crash if there is no CPU backend (#13395)	8 months ago
Johannes Gäßler	5c86c9ed3e CUDA: fix crash on large batch size for MoE models (#13384)	8 months ago
Bartowski	efb8b47eda imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)	8 months ago
R0CKSTAR	0527771dd8 llama-run: add support for downloading models from ModelScope (#13370)	8 months ago
Xuan-Son Nguyen	2189fd3b63 mtmd : fix batch_view for m-rope (#13397)	8 months ago
Xuan-Son Nguyen	3f96aeff39 llama : one-off chat template fix for Mistral-Small-2503 (#13398)	8 months ago
