Johannes Gäßler | e789095502 | llama: print memory breakdown on exit (#15860) | 3 months ago
Georgi Gerganov | b730706a49 | kv-cache : support layer reuse (#15504) | 4 months ago
Georgi Gerganov | 9ebebef62f | llama : remove KV cache defragmentation logic (#15473) | 5 months ago
Georgi Gerganov | 715a6db02c | kv-cache : drop the "unified" prefix (#15467) | 5 months ago
Georgi Gerganov | d32e03f449 | server : add SWA checkpoints (#15293) | 5 months ago
Georgi Gerganov | 745f11fed0 | memory : correctly handle failure in apply() (#14438) | 6 months ago
Georgi Gerganov | 692e3cdd0a | memory : rename interface to llama_memory_context_i (#14296) | 7 months ago
Georgi Gerganov | 4c9fdfbe15 | ubatch : new splitting logic (#14217) | 7 months ago
Georgi Gerganov | d3e64b9f49 | llama : rework embeddings logic (#14208) | 7 months ago
Georgi Gerganov | c3ee46fab4 | batch : remove logits_all flag (#14141) | 7 months ago
Georgi Gerganov | 745aa5319b | llama : deprecate llama_kv_self_ API (#14030) | 7 months ago
Georgi Gerganov | 7f37b6cf1e | memory : migrate from llama_kv_cache to more generic llama_memory (#14006) | 7 months ago
Georgi Gerganov | 3e63a58ef7 | kv-cache : refactor the update/defrag mechanism (#13988) | 7 months ago
Georgi Gerganov | 12d0188c0d | kv-cache : refactor + add llama_memory_state_i (#13746) | 7 months ago
Georgi Gerganov | de2ef53a4b | kv-cache : rework kv_cell (#13706) | 7 months ago
Georgi Gerganov | e298d2fbd0 | kv-cache : add SWA support (#13194) | 8 months ago
Georgi Gerganov | c642bc014c | kv-cache : separate recurrent vs non-recurrent impl (#12799) | 8 months ago
Georgi Gerganov | 3e1d29348b | kv-cache : simplify + fix warning for recurrent models (#12756) | 9 months ago
Georgi Gerganov | e0dbec0bc6 | llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181) | 10 months ago