Commit History

Author           SHA1        Message                                                                 Date
Georgi Gerganov  225e7a1438  llama : add high-throughput mode (#14363)                               6 months ago
Georgi Gerganov  67d1ef23c6  batch : add optional for sequential equal split (#14511)                7 months ago
Georgi Gerganov  c79184d2d1  batch : add n_used count (#14512)                                       7 months ago
Georgi Gerganov  a70c8a0c4b  kv-cache : use ggml_set_rows (#14285)                                   7 months ago
Georgi Gerganov  745f11fed0  memory : correctly handle failure in apply() (#14438)                   7 months ago
Georgi Gerganov  692e3cdd0a  memory : rename interface to llama_memory_context_i (#14296)            7 months ago
Georgi Gerganov  4c9fdfbe15  ubatch : new splitting logic (#14217)                                   7 months ago
Gabe Goodhart    edc4a29eff  memory : Hybrid recurrent cache (#13979)                                7 months ago
Georgi Gerganov  d3e64b9f49  llama : rework embeddings logic (#14208)                                7 months ago
Georgi Gerganov  c3ee46fab4  batch : remove logits_all flag (#14141)                                 7 months ago
Georgi Gerganov  9596506965  kv-cache : fix split_equal handling in unified implementation (#14130)  7 months ago
Georgi Gerganov  745aa5319b  llama : deprecate llama_kv_self_ API (#14030)                           8 months ago
Georgi Gerganov  3e63a58ef7  kv-cache : refactor the update/defrag mechanism (#13988)                8 months ago
Georgi Gerganov  0fc16b42e8  kv-cache : split implementation in separate sources (#13920)            8 months ago