Commit History

Author           SHA1        Message                                                                     Date
Xuan-Son Nguyen  af6ae1efb2  llama : fix non-causal mask for gemma 3 (#12615)                            9 months ago
Georgi Gerganov  75422e8bc4  graph : normalize Q, K, V shapes + sync cross attention (#12449)            10 months ago
fairydreaming    8fcb563613  Load all MoE experts during warmup (#11571)                                 10 months ago
Georgi Gerganov  c522ce4143  graph : simplify attn input build for unified KV cache (#12381)             10 months ago
Georgi Gerganov  081bee8c64  hparams : add SWA rope parameters (#12374)                                  10 months ago
Georgi Gerganov  84d5475541  llama : fix Gemma3 SWA KV cache shift (#12373)                              10 months ago
Georgi Gerganov  e0dbec0bc6  llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)  10 months ago