Johannes Gäßler | e789095502 | llama: print memory breakdown on exit (#15860) | 4 months ago
Georgi Gerganov | c610b6c11b | kv-cache : fix SWA checks + disable cacheless iSWA (#15811) | 4 months ago
Daniel Bevenius | fb15d649ed | llama : add support for EmbeddingGemma 300m (#15798) | 4 months ago
Georgi Gerganov | b730706a49 | kv-cache : support layer reuse (#15504) | 5 months ago
Georgi Gerganov | 715a6db02c | kv-cache : drop the "unified" prefix (#15467) | 5 months ago
Georgi Gerganov | d32e03f449 | server : add SWA checkpoints (#15293) | 5 months ago
compilade | 11a3811164 | memory : handle kv_unified for hybrid models (#15050) | 5 months ago
Georgi Gerganov | a70c8a0c4b | kv-cache : use ggml_set_rows (#14285) | 6 months ago
Georgi Gerganov | 692e3cdd0a | memory : rename interface to llama_memory_context_i (#14296) | 7 months ago
Georgi Gerganov | 4c9fdfbe15 | ubatch : new splitting logic (#14217) | 7 months ago
Gabe Goodhart | edc4a29eff | memory : Hybrid recurrent cache (#13979) | 7 months ago