Georgi Gerganov
|
b730706a49
kv-cache : support layer reuse (#15504)
|
5 months ago |
Piotr Wilkin (ilintar)
|
b1afcab804
model : add support for Seed-OSS (#15490)
|
5 months ago |
Tarek Dakhran
|
e288693669
readme : model : mtdm : lfm2 improvements (#15476)
|
5 months ago |
Georgi Gerganov
|
3f196be84b
graph : remove build_attn_with_sinks overload (#15469)
|
5 months ago |
Georgi Gerganov
|
715a6db02c
kv-cache : drop the "unified" prefix (#15467)
|
5 months ago |
Georgi Gerganov
|
9ef6b0b835
model : add gpt-oss type strings (#15424)
|
5 months ago |
Sigbjørn Skjæret
|
baa9255a45
llama : merge conts and reshapes and remove unnecessary cont (#15380)
|
5 months ago |
Daniel Bevenius
|
7a0de96045
llama : add 18-layer model type for Gemma 3-270m (#15319)
|
5 months ago |
Georgi Gerganov
|
fd1234cb46
llama : add gpt-oss (#15091)
|
6 months ago |
Juk Armstrong
|
c81de6e107
Fix `glm4moe` bug (#15088)
|
6 months ago |
Sam
|
ef0144c087
model: support GLM 4.5 family of models (#14939)
|
6 months ago |
compilade
|
11a3811164
memory : handle kv_unified for hybrid models (#15050)
|
6 months ago |
Douglas Hanley
|
339bd0268c
model : support Qwen3-Embedding (#15023)
|
6 months ago |
stevenkuang
|
0f5ccd6fd1
model : add hunyuan dense (#14878)
|
6 months ago |
Diego Devesa
|
d6818d06a6
llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)
|
6 months ago |
Dongliang Wei
|
c1dacaa99b
llama : merge build_moe_ffn_from_probs function into build_moe_ffn (#14968)
|
6 months ago |
Aman Gupta
|
8a4a856277
Add LLaDA 8b Diffusion model (#14771)
|
6 months ago |
Dongliang Wei
|
6c6e397aff
model : add support for SmallThinker series (#14898)
|
6 months ago |
Gabriel Larson
|
4762ad7316
model : make rope_yarn_log_mul optional for deepseek2 (#14896)
|
6 months ago |
Shunta Saito
|
1dc9614e06
llama : fix kq_scale for the attention layers of PLaMo2 (#14892)
|
6 months ago |
yummy
|
86f5623d90
llama : fix MiniCPM inference after Granite Four changes (#14850)
|
6 months ago |
Molly Sophia
|
d4d1522b20
llama : add model type detection for rwkv7 7B&14B (#14816)
|
6 months ago |
Georgi Gerganov
|
eacdeb5bfc
model : fix build after merge conflict (#14754)
|
6 months ago |
lgai-exaone
|
e0cb5c5cb8
model : add EXAONE 4.0 support (#14630)
|
6 months ago |
Georgi Gerganov
|
8f974bc1e9
graph : refactor context to not pass gf explicitly (#14629)
|
6 months ago |
Piotr Wilkin (ilintar)
|
cb887f1bc1
model: add Ernie 4.5 MoE support (#14658)
|
6 months ago |
Georgi Gerganov
|
01612b7409
llama : reuse compute graphs (#14482)
|
6 months ago |
Tarek Dakhran
|
086cf81e88
llama : fix parallel processing for lfm2 (#14705)
|
6 months ago |
tempstudio
|
b0f0ecc3dc
model : support output bias for qwen2 (#14711)
|
6 months ago |
Georgi Gerganov
|
225e7a1438
llama : add high-throughput mode (#14363)
|
6 months ago |