Commit History

Author SHA1 Message Date
  Johannes Gäßler b1f3a6e5db llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) 1 month ago
  Georgi Gerganov 609a2d0268 models : fix YaRN regression + consolidate logic (#18006) 1 month ago
  Georgi Gerganov 7bed317f53 models : fix the attn_factor for mistral3 graphs + improve consistency (#17945) 1 month ago
  Georgi Gerganov 4dff236a52 ggml : remove GGML_KQ_MASK_PAD constant (#17910) 1 month ago
  JJJYmmm d261223d24 model: add support for qwen3vl series (#16780) 2 months ago
  Xuan-Son Nguyen e3af5563bd llama: store mrope data in KV cell (#16825) 2 months ago
  Georgi Gerganov 85a7d8677b memory : remove KV cache size padding (#16812) 2 months ago
  Johannes Gäßler 7a0e900e36 llama: consistent ctx <-> buf order for KV cache (#16746) 2 months ago
  Georgi Gerganov d00cbea63c server : host-memory prompt caching (#16391) 3 months ago
  Johannes Gäßler e789095502 llama: print memory breakdown on exit (#15860) 3 months ago
  Georgi Gerganov cf0e3ba150 model : avoid ggml_cont_3d for fused QKV weights (#15662) 4 months ago
  Georgi Gerganov c610b6c11b kv-cache : fix SWA checks + disable cacheless iSWA (#15811) 4 months ago
  Daniel Bevenius fb15d649ed llama : add support for EmbeddingGemma 300m (#15798) 4 months ago
  Georgi Gerganov c8d0d14e77 kv-cache : fix find_slot to not search for continuous slot (#15638) 4 months ago
  Georgi Gerganov 8a4280ce43 kv-cache : remove LLAMA_SET_ROWS checks (#15505) 4 months ago
  Georgi Gerganov 1bded5a3b3 kv-cache : better estimate of n_kv for multi-sequence batches (#15610) 4 months ago
  Georgi Gerganov b730706a49 kv-cache : support layer reuse (#15504) 5 months ago
  Georgi Gerganov 9ebebef62f llama : remove KV cache defragmentation logic (#15473) 5 months ago
  Georgi Gerganov 715a6db02c kv-cache : drop the "unified" prefix (#15467) 5 months ago
  Georgi Gerganov 7f37b6cf1e memory : migrate from llama_kv_cache to more generic llama_memory (#14006) 7 months ago
  Georgi Gerganov 0fc16b42e8 kv-cache : split implementation in separate sources (#13920) 7 months ago
  Georgi Gerganov 3600cc2886 llama : use n_swa + n_ubatch cells for SWA cache (#13833) 7 months ago
  Georgi Gerganov 3f55f781f1 llama : auto-batch preparation (#13845) 7 months ago
  Georgi Gerganov 12d0188c0d kv-cache : refactor + add llama_memory_state_i (#13746) 7 months ago
  Xuan-Son Nguyen 763d06edb7 llama : fix KV shift for qwen2vl (#13870) 7 months ago
  Georgi Gerganov 81713121ee kv-cells : track min/max used cells and per-sequence positions (#13808) 7 months ago
  Georgi Gerganov de2ef53a4b kv-cache : rework kv_cell (#13706) 8 months ago
  Georgi Gerganov 797f2ac062 kv-cache : simplify the interface (#13660) 8 months ago
  Georgi Gerganov a4090d1174 llama : remove llama_kv_cache_view API + remove deprecated (#13653) 8 months ago
  Georgi Gerganov e298d2fbd0 kv-cache : add SWA support (#13194) 8 months ago