cturan/llama.cpp

Author	SHA1 Message	Date
Sigbjørn Skjæret	9008027aa3 hparams : add n_embd_inp() to support extended embed (#16928)	2 months ago
Jan Boon	d7395115ba llama : use std::abs instead of abs (#16853)	2 months ago
Sigbjørn Skjæret	f696428ce8 graph : add clamping to ffn_moe_weights_sum to avoid div-by-zero (#16655)	2 months ago
Aman Gupta	f77c13b91f CUDA: General GEMV fusion (#16715)	2 months ago
Sigbjørn Skjæret	84bf3c6778 model : add BailingMoeV2 support (#16063)	2 months ago
Georgi Gerganov	e60f241eac metal : FA support F32 K and V and head size = 32 (#16531)	3 months ago
Georgi Gerganov	e38b7c6e9e graph : support cacheless embeddings with FA and iSWA (#16528)	3 months ago
Saba Fallah	e08db42595 model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367)	3 months ago
Sigbjørn Skjæret	835b2b915c model : add GroveMoE support (#15510)	3 months ago
Aman Gupta	077c94d0ca CUDA: add a fused top-K MoE kernel (#16130)	3 months ago
Douglas Hanley	b5bd037832 llama : add support for qwen3 reranker (#15824)	3 months ago
Sigbjørn Skjæret	b8e09f08b9 model : add grok-2 support (#15539)	4 months ago
Sigbjørn Skjæret	6ab397e12b graph : support non-contiguous Q in build_attn_mha (#15908)	4 months ago
Georgi Gerganov	663027fd54 context : fix n_outputs during reserve (#15858)	4 months ago
Georgi Gerganov	c610b6c11b kv-cache : fix SWA checks + disable cacheless iSWA (#15811)	4 months ago
Daniel Bevenius	fb15d649ed llama : add support for EmbeddingGemma 300m (#15798)	4 months ago
Johannes Gäßler	e81b8e4b7f llama: use FA + max. GPU layers by default (#15434)	4 months ago
Georgi Gerganov	8a4280ce43 kv-cache : remove LLAMA_SET_ROWS checks (#15505)	4 months ago
Georgi Gerganov	0373486dbc graph : fix assert in memory-less build_attn (#15590)	4 months ago
Georgi Gerganov	3f196be84b graph : remove build_attn_with_sinks overload (#15469)	4 months ago
Georgi Gerganov	715a6db02c kv-cache : drop the "unified" prefix (#15467)	5 months ago
Georgi Gerganov	fd1234cb46 llama : add gpt-oss (#15091)	5 months ago
Sam	ef0144c087 model: support GLM 4.5 family of models (#14939)	5 months ago
Dongliang Wei	c1dacaa99b llama : merge build_moe_ffn_from_probs function into build_moe_ffn (#14968)	5 months ago
compilade	66625a59a5 graph : reduce splits for recurrent and hybrid models (#14825)	5 months ago
Douglas Hanley	a118d80233 embeddings: fix extraction of CLS pooling results (#14927)	5 months ago
Dongliang Wei	6c6e397aff model : add support for SmallThinker series (#14898)	5 months ago
Georgi Gerganov	bf9087f59a metal : fuse add, mul + add tests (#14596)	6 months ago
Georgi Gerganov	9fb1042ce6 graph : fix graph reuse reset of params (#14760)	6 months ago
Georgi Gerganov	d498af3d5a graph : avoid huge warm-up graphs for MoE models (#14753)	6 months ago
