Commit History

Author SHA1 Message Date
  Sigbjørn Skjæret c8554b66e0 graph : use fill instead of scale_bias in grouped expert selection (#17867) 1 month ago
  Daniel Bevenius 2fa51c19b0 model-conversion : add token ids to prompt token output [no ci] (#17863) 1 month ago
  Xuan-Son Nguyen 951520ddb0 server: delegate result_state creation to server_task (#17835) 1 month ago
  Neo Zhang 68522c678d ci : support bfloat16 SYCL release package (#17855) 1 month ago
  Xuan-Son Nguyen f896d2c34f server: improve speed of speculative decoding (#17808) 1 month ago
  Piotr Wilkin (ilintar) e4e9c4329c Make graph_max_nodes vary by ubatch size (#17794) 1 month ago
  hksdpc255 636fc17a37 Fix Kimi-K2 tool-call parsing issues (#17376) 1 month ago
  Jay Zenith 51e0c2d917 cuda : add FILL op support (#17851) 1 month ago
  Xuan-Son Nguyen 37a4f63244 server : add development documentation (#17760) 1 month ago
  Georgi Gerganov 2bc96931d2 server : make cache_reuse configurable per request (#17858) 1 month ago
  wsbagnsv1 5814b4dce1 cuda: optimize SOLVE_TRI using registers and FMAF (#17703) 1 month ago
  ixgbe 79d61896d3 ggml-cpu: add ggml_thread_cpu_relax with Zihintpause support (#17784) 1 month ago
  Xuan-Son Nguyen 4d3726278b model: add llama 4 scaling for mistral-large (deepseek arch) (#17744) 1 month ago
  lovedheart 08f9d3cc1d Vulkan: improve mul_mat_vec_iq1_m (#16907) 1 month ago
  Sigbjørn Skjæret 0a540f9abd ci : add windows-cuda 13.1 release (#17839) 1 month ago
  Sigbjørn Skjæret 22577583a3 common : change --color to accept on/off/auto, default to auto (#17827) 1 month ago
  Law Po Ying d9e03db1e7 sycl: add missing BF16 conversion support for Intel oneAPI (#17780) 1 month ago
  Jeff Bolz db97837385 vulkan: perf_logger improvements (#17672) 1 month ago
  Vishal Singh 017761daf5 ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690) 1 month ago
  Xuan-Son Nguyen c42712b056 server: support multiple generations from one prompt (OAI "n" option) (#17775) 1 month ago
  Phylliida Dev 09c7c50e64 ggml : add circular tiling support to pad, for Vulkan, CUDA, and CPU (used for making seamless textures) (#16985) 1 month ago
  Johannes Gäßler f334b79494 HIP: fix RDNA3 FP16/BF16 matrix multiplication (#17817) 1 month ago
  Aleksander Grygier a28e3c7567 webui: Stop generation from chat sidebar (#17806) 1 month ago
  Aleksander Grygier e31b5c55c3 webui: Fix context available value in Multi-model Router mode (#17804) 1 month ago
  Aleksander Grygier 21f24f27a9 webui: Per-conversation system message with UI displaying, edition & branching (#17275) 1 month ago
  Sky 7b43f55753 ggml : improve error handling for search path existence checks (#17653) 1 month ago
  Daniel Bevenius 444f00b0ec llama : remove quantization sanity check (#17788) 1 month ago
  Jeff Bolz 2960eb2975 vulkan: Use one row per workgroup for f32 mmv (#17711) 1 month ago
  Xuan-Son Nguyen dbc15a7967 convert: support Mistral 3 Large MoE (#17730) 1 month ago
  Jeff Bolz c6c5e85979 vulkan: support solve_tri with larger N/K values (#17781) 1 month ago