Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Sigbjørn Skjæret | 169ee68ffb | model : remove modern-bert iswa template (#18529) | 4 weeks ago |
| tt | ced765be44 | model: support youtu-vl model (#18479) | 4 weeks ago |
| o7si | d0a6a31470 | model : add support for JinaBertModel with non-gated ffn (#18475) | 4 weeks ago |
| HelloKS | f4f5019254 | model: add Solar Open model (#18511) | 4 weeks ago |
| momonga | 9c675c7140 | model : Plamo3 support (#17304) | 1 month ago |
| Johannes Gäßler | 026d2ad472 | llama: fix magic number of 999 for GPU layers (#18266) | 1 month ago |
| Xuan-Son Nguyen | 4cbafad4f0 | model: support MiMo-V2-Flash (#18328) | 1 month ago |
| Saba Fallah | 54132f1b1f | model : support for LlamaBidirectionalModel architecture (#18220) | 1 month ago |
| Alessandro98-git | 96e33a814e | model : fix div-by-zero for Nemotron V2 (#18309) | 1 month ago |
| Ryan Mangeno | dfc959b886 | model : Granite Embedding support (#15641) | 1 month ago |
| Johannes Gäßler | 57c1e05643 | llama: offload output layer to GPU first (#18148) | 1 month ago |
| Xuan-Son Nguyen | ef83fb8601 | model: fix LFM2 missing tensors (#18105) | 1 month ago |
| Xuan-Son Nguyen | 3d86c6c2b5 | model: support GLM4V vision encoder (#18042) | 1 month ago |
| Daniel Bevenius | 2995341730 | llama : add support for NVIDIA Nemotron 3 Nano (#18058) | 1 month ago |
| HelloKS | 9d52f17ae3 | model : add KORMo model (#18032) | 1 month ago |
| Johannes Gäßler | b1f3a6e5db | llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) | 1 month ago |
| Xuan-Son Nguyen | 0759b09c90 | graph: add f_attn_temp_offset (#18025) | 1 month ago |
| Georgi Gerganov | 609a2d0268 | models : fix YaRN regression + consolidate logic (#18006) | 1 month ago |
| Georgi Gerganov | 7bed317f53 | models : fix the attn_factor for mistral3 graphs + improve consistency (#17945) | 1 month ago |
| Eric Zhang | b677721819 | model : Qwen3-Next-80B-A3B has 48 layers (#17898) | 1 month ago |
| Sigbjørn Skjæret | 42b12b5608 | model : nit, DeepSeek V1 MoE is 16B and GigaChat is 20B (#12652) | 1 month ago |
| philip-essential | 1d2a1ab73d | model : support Rnj-1 (#17811) | 1 month ago |
| Xuan-Son Nguyen | 4d3726278b | model: add llama 4 scaling for mistral-large (deepseek arch) (#17744) | 1 month ago |
| Herman Semenoff | 37adc9c6ba | ggml, llama : use defaulted constructors/destructors (#17649) | 1 month ago |
| Piotr Wilkin (ilintar) | 746f9ee889 | Override SSM_A op for Qwen3 Next to reduce splits (#17587) | 1 month ago |
| Gilad S. | 00c361fe53 | fix: llama arch implementation (#17665) | 1 month ago |
| Xuan-Son Nguyen | cd3c118908 | model: support Ministral3 (#17644) | 1 month ago |
| Piotr Wilkin (ilintar) | ff55414c42 | model : Qwen3 Next (#16095) | 2 months ago |
| Georgi Gerganov | 6783b11fb0 | models : fix LFM2 tensors (#17548) | 2 months ago |
| Aaron Teo | 877566d512 | llama: introduce support for model-embedded sampling parameters (#17120) | 2 months ago |