Xuan-Son Nguyen
|
e3af5563bd
llama: store mrope data in KV cell (#16825)
|
3 months ago |
Georgi Gerganov
|
9ebebef62f
llama : remove KV cache defragmentation logic (#15473)
|
5 months ago |
Georgi Gerganov
|
715a6db02c
kv-cache : drop the "unified" prefix (#15467)
|
5 months ago |
Georgi Gerganov
|
a70c8a0c4b
kv-cache : use ggml_set_rows (#14285)
|
6 months ago |
Georgi Gerganov
|
7b50d589a8
kv-cells : fix tracking of seq_pos (#14339)
|
7 months ago |
Georgi Gerganov
|
4c9fdfbe15
ubatch : new splitting logic (#14217)
|
7 months ago |
Georgi Gerganov
|
c311ac664d
cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188)
|
7 months ago |
Georgi Gerganov
|
40cbf571c9
kv-cache : fix shift and defrag logic (#14081)
|
7 months ago |
Georgi Gerganov
|
12d0188c0d
kv-cache : refactor + add llama_memory_state_i (#13746)
|
8 months ago |
Georgi Gerganov
|
81713121ee
kv-cells : track min/max used cells and per-sequence positions (#13808)
|
8 months ago |
Georgi Gerganov
|
de2ef53a4b
kv-cache : rework kv_cell (#13706)
|
8 months ago |