Commit History

Author SHA1 Message Date
  Tianyue-Zhao bacddc049a model: Add support for CogVLM model (#15002) 3 months ago
  Georgi Gerganov 85a7d8677b memory : remove KV cache size padding (#16812) 3 months ago
  Johannes Gäßler 7a0e900e36 llama: consistent ctx <-> buf order for KV cache (#16746) 3 months ago
  Johannes Gäßler 945501f5ea llama: fix leaked buffers for mmap + split files (#16765) 3 months ago
  Sigbjørn Skjæret 73a48c9790 convert : enable expert group selection for all models with it (#16691) 3 months ago
  Sigbjørn Skjæret 7cce4f8158 model : set res->t_embd in SmallThinker models (#16782) 3 months ago
  Shunta Saito 226f295f4d model : set res->t_embd in PLaMo2 models (#16766) 3 months ago
  Max Krasnyansky 63d2fc46e1 Add experimental ggml-hexagon backend for the Hexagon NPU (#16547) 3 months ago
  Sigbjørn Skjæret 84bf3c6778 model : add BailingMoeV2 support (#16063) 3 months ago
  Giuseppe Scrivano 0398752dd4 model : add Granite Hybrid types (#16635) 3 months ago
  Johannes Gäßler 66b0dbcb2d llama-model: fix insonsistent ctxs <-> bufs order (#16581) 3 months ago
  Xuan-Son Nguyen 3e3cb19f64 llama-quant: add support for mmproj (#16592) 3 months ago
  Georgi Gerganov e38b7c6e9e graph : support cacheless embeddings with FA and iSWA (#16528) 3 months ago
  Georgi Gerganov a3cb04744f metal : fix mul-mm condition + fix mul-mv permuted kernels (#16494) 3 months ago
  Saba Fallah e08db42595 model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367) 3 months ago
  Tarek Dakhran aeaf8a36f0 llama : support LiquidAI LFM2-MoE hybrid model (#16464) 3 months ago
  Gadflyii 3df2244df4 llama : add --no-host to disable host buffers (#16310) 3 months ago
  ddh0 f6dcda3900 server : context checkpointing for hybrid and recurrent models (#16382) 3 months ago
  Sigbjørn Skjæret 946f71ed9a llama : fix shapes for bert/mpt q/k norm (#16409) 3 months ago
  Piotr Wilkin (ilintar) 34fcc5a4ac model : Apertus model implementation (#15852) 3 months ago
  Shunta Saito ded67b9444 llama : parameter conversion and loading fixes for PLaMo2 variants (#16075) 3 months ago
  Bartowski e74c92e842 model : support GLM 4.6 (make a few NextN/MTP tensors not required) (#16359) 4 months ago
  anavp-nvidia a014310374 cuda : Enable CUDA Graph usage for Nemotron Nano v2 (NemotronH) (#16328) 4 months ago
  Vinkal 72b24d96c6 model : make minicpm embedding_scale, residual_scale and logit_scale optional with legacy defaults (#16273) 4 months ago
  Sigbjørn Skjæret 835b2b915c model : add GroveMoE support (#15510) 4 months ago
  Douglas Hanley b5bd037832 llama : add support for qwen3 reranker (#15824) 4 months ago
  Johannes Gäßler e789095502 llama: print memory breakdown on exit (#15860) 4 months ago
  Tarek Dakhran 3a59971967 model : add label for LiquidAI LFM2-2.6B model (#16204) 4 months ago
  Xuan-Son Nguyen 8f8f2274ee convert : add Llama4ForCausalLM (#16042) 4 months ago
  Shane A 85286f3548 model : add OLMo3 support (#16015) 4 months ago