ddh0 | f6dcda3900 | server : context checkpointing for hybrid and recurrent models (#16382) | 3 months ago
Johannes Gäßler | e789095502 | llama: print memory breakdown on exit (#15860) | 4 months ago
Georgi Gerganov | c610b6c11b | kv-cache : fix SWA checks + disable cacheless iSWA (#15811) | 4 months ago
Daniel Bevenius | fb15d649ed | llama : add support for EmbeddingGemma 300m (#15798) | 4 months ago
Georgi Gerganov | b730706a49 | kv-cache : support layer reuse (#15504) | 5 months ago
Georgi Gerganov | 715a6db02c | kv-cache : drop the "unified" prefix (#15467) | 5 months ago