c560316440 graph : reuse SSM graphs (#16490) - Georgi Gerganov, 1 month ago
2995341730 llama : add support for NVIDIA Nemotron 3 Nano (#18058) - Daniel Bevenius, 1 month ago
0759b09c90 graph: add f_attn_temp_offset (#18025) - Xuan-Son Nguyen, 1 month ago
609a2d0268 models : fix YaRN regression + consolidate logic (#18006) - Georgi Gerganov, 1 month ago
7bed317f53 models : fix the attn_factor for mistral3 graphs + improve consistency (#17945) - Georgi Gerganov, 1 month ago
4dff236a52 ggml : remove GGML_KQ_MASK_PAD constant (#17910) - Georgi Gerganov, 1 month ago
c8554b66e0 graph : use fill instead of scale_bias in grouped expert selection (#17867) - Sigbjørn Skjæret, 1 month ago
cd3c118908 model: support Ministral3 (#17644) - Xuan-Son Nguyen, 2 months ago
6eea666912 llama-graph: avoid expand_forward for fusion (#17633) - Aman Gupta, 2 months ago
583cb83416 ggml : add ggml_top_k (#17365) - Georgi Gerganov, 2 months ago
a90eb94ca9 CUDA: fuse rope + set_rows (#16884) - Aman Gupta, 2 months ago
9008027aa3 hparams : add n_embd_inp() to support extended embed (#16928) - Sigbjørn Skjæret, 2 months ago
d7395115ba llama : use std::abs instead of abs (#16853) - Jan Boon, 3 months ago
f696428ce8 graph : add clamping to ffn_moe_weights_sum to avoid div-by-zero (#16655) - Sigbjørn Skjæret, 3 months ago
f77c13b91f CUDA: General GEMV fusion (#16715) - Aman Gupta, 3 months ago
84bf3c6778 model : add BailingMoeV2 support (#16063) - Sigbjørn Skjæret, 3 months ago
e60f241eac metal : FA support F32 K and V and head size = 32 (#16531) - Georgi Gerganov, 3 months ago
e38b7c6e9e graph : support cacheless embeddings with FA and iSWA (#16528) - Georgi Gerganov, 3 months ago
e08db42595 model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367) - Saba Fallah, 3 months ago
835b2b915c model : add GroveMoE support (#15510) - Sigbjørn Skjæret, 4 months ago
077c94d0ca CUDA: add a fused top-K MoE kernel (#16130) - Aman Gupta, 4 months ago
b5bd037832 llama : add support for qwen3 reranker (#15824) - Douglas Hanley, 4 months ago
b8e09f08b9 model : add grok-2 support (#15539) - Sigbjørn Skjæret, 4 months ago
6ab397e12b graph : support non-contiguous Q in build_attn_mha (#15908) - Sigbjørn Skjæret, 4 months ago
663027fd54 context : fix n_outputs during reserve (#15858) - Georgi Gerganov, 4 months ago
c610b6c11b kv-cache : fix SWA checks + disable cacheless iSWA (#15811) - Georgi Gerganov, 4 months ago
fb15d649ed llama : add support for EmbeddingGemma 300m (#15798) - Daniel Bevenius, 4 months ago
e81b8e4b7f llama: use FA + max. GPU layers by default (#15434) - Johannes Gäßler, 5 months ago
8a4280ce43 kv-cache : remove LLAMA_SET_ROWS checks (#15505) - Georgi Gerganov, 5 months ago
0373486dbc graph : fix assert in memory-less build_attn (#15590) - Georgi Gerganov, 5 months ago