Commit History

Author           SHA1        Message                                                                     Date
Johannes Gäßler  e789095502  llama: print memory breakdown on exit (#15860)                              3 months ago
Georgi Gerganov  b730706a49  kv-cache : support layer reuse (#15504)                                     4 months ago
Georgi Gerganov  9ebebef62f  llama : remove KV cache defragmentation logic (#15473)                      5 months ago
Georgi Gerganov  715a6db02c  kv-cache : drop the "unified" prefix (#15467)                               5 months ago
Georgi Gerganov  d32e03f449  server : add SWA checkpoints (#15293)                                       5 months ago
Georgi Gerganov  745f11fed0  memory : correctly handle failure in apply() (#14438)                       6 months ago
Georgi Gerganov  692e3cdd0a  memory : rename interface to llama_memory_context_i (#14296)                7 months ago
Georgi Gerganov  4c9fdfbe15  ubatch : new splitting logic (#14217)                                       7 months ago
Georgi Gerganov  d3e64b9f49  llama : rework embeddings logic (#14208)                                    7 months ago
Georgi Gerganov  c3ee46fab4  batch : remove logits_all flag (#14141)                                     7 months ago
Georgi Gerganov  745aa5319b  llama : deprecate llama_kv_self_ API (#14030)                               7 months ago
Georgi Gerganov  7f37b6cf1e  memory : migrate from llama_kv_cache to more generic llama_memory (#14006)  7 months ago
Georgi Gerganov  3e63a58ef7  kv-cache : refactor the update/defrag mechanism (#13988)                    7 months ago
Georgi Gerganov  12d0188c0d  kv-cache : refactor + add llama_memory_state_i (#13746)                     7 months ago
Georgi Gerganov  de2ef53a4b  kv-cache : rework kv_cell (#13706)                                          7 months ago
Georgi Gerganov  e298d2fbd0  kv-cache : add SWA support (#13194)                                         8 months ago
Georgi Gerganov  c642bc014c  kv-cache : separate recurrent vs non-recurrent impl (#12799)                8 months ago
Georgi Gerganov  3e1d29348b  kv-cache : simplify + fix warning for recurrent models (#12756)             9 months ago
Georgi Gerganov  e0dbec0bc6  llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)  10 months ago