Commit History

Author SHA1 Message Date
  Xuan-Son Nguyen cd3c118908 model: support Ministral3 (#17644) 1 month ago
  Aman Gupta 6eea666912 llama-graph: avoid expand_forward for fusion (#17633) 1 month ago
  Georgi Gerganov 583cb83416 ggml : add ggml_top_k (#17365) 1 month ago
  Aman Gupta a90eb94ca9 CUDA: fuse rope + set_rows (#16884) 2 months ago
  Sigbjørn Skjæret 9008027aa3 hparams : add n_embd_inp() to support extended embed (#16928) 2 months ago
  Jan Boon d7395115ba llama : use std::abs instead of abs (#16853) 2 months ago
  Sigbjørn Skjæret f696428ce8 graph : add clamping to ffn_moe_weights_sum to avoid div-by-zero (#16655) 2 months ago
  Aman Gupta f77c13b91f CUDA: General GEMV fusion (#16715) 2 months ago
  Sigbjørn Skjæret 84bf3c6778 model : add BailingMoeV2 support (#16063) 3 months ago
  Georgi Gerganov e60f241eac metal : FA support F32 K and V and head size = 32 (#16531) 3 months ago
  Georgi Gerganov e38b7c6e9e graph : support cacheless embeddings with FA and iSWA (#16528) 3 months ago
  Saba Fallah e08db42595 model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367) 3 months ago
  Sigbjørn Skjæret 835b2b915c model : add GroveMoE support (#15510) 3 months ago
  Aman Gupta 077c94d0ca CUDA: add a fused top-K MoE kernel (#16130) 3 months ago
  Douglas Hanley b5bd037832 llama : add support for qwen3 reranker (#15824) 3 months ago
  Sigbjørn Skjæret b8e09f08b9 model : add grok-2 support (#15539) 4 months ago
  Sigbjørn Skjæret 6ab397e12b graph : support non-contiguous Q in build_attn_mha (#15908) 4 months ago
  Georgi Gerganov 663027fd54 context : fix n_outputs during reserve (#15858) 4 months ago
  Georgi Gerganov c610b6c11b kv-cache : fix SWA checks + disable cacheless iSWA (#15811) 4 months ago
  Daniel Bevenius fb15d649ed llama : add support for EmbeddingGemma 300m (#15798) 4 months ago
  Johannes Gäßler e81b8e4b7f llama: use FA + max. GPU layers by default (#15434) 4 months ago
  Georgi Gerganov 8a4280ce43 kv-cache : remove LLAMA_SET_ROWS checks (#15505) 4 months ago
  Georgi Gerganov 0373486dbc graph : fix assert in memory-less build_attn (#15590) 4 months ago
  Georgi Gerganov 3f196be84b graph : remove build_attn_with_sinks overload (#15469) 5 months ago
  Georgi Gerganov 715a6db02c kv-cache : drop the "unified" prefix (#15467) 5 months ago
  Georgi Gerganov fd1234cb46 llama : add gpt-oss (#15091) 5 months ago
  Sam ef0144c087 model: support GLM 4.5 family of models (#14939) 5 months ago
  Dongliang Wei c1dacaa99b llama : merge build_moe_ffn_from_probs function into build_moe_ffn (#14968) 5 months ago
  compilade 66625a59a5 graph : reduce splits for recurrent and hybrid models (#14825) 5 months ago
  Douglas Hanley a118d80233 embeddings: fix extraction of CLS pooling results (#14927) 5 months ago