Commit History

Author SHA1 Message Date
  Giuseppe Scrivano e58d585604 model : add Granite Hybrid nano types (#16896) 2 months ago
  Johannes Gäßler 31c511a968 CUDA: Volta tensor core support for MMF (#16843) 2 months ago
  Georgi Gerganov 6d39015a74 sync : ggml 2 months ago
  Aman Gupta 4146d6a1a6 CUDA: add expert reduce kernel (#16857) 2 months ago
  Georgi Gerganov 8da3c0e200 batch : fix consistency checks for the input positions (#16890) 2 months ago
  Georgi Gerganov c22473b580 server : don't print user inputs to console (#16871) 2 months ago
  Daniel Bevenius 0f715b4e75 server : fix typos in server.cpp comments [no ci] (#16883) 2 months ago
  Jeff Bolz d2d931f173 vulkan: disable spirv-opt for rope shaders (#16872) 2 months ago
  Masato Nakasaka 2976b0374d vulkan: Fix crash when FP16 mul_mat accumulation is not supported (#16796) 2 months ago
  Ruben Ortlam d2a2673dd1 vulkan: fix shmem overrun in mmq id shader (#16873) 2 months ago
  l3utterfly 13002a0896 ggml-hexagon: respect input size when getting/setting tensor data (#16836) 2 months ago
  Sigbjørn Skjæret 6eb208d17e ci : enable free-disk-space on cuda docker build (#16877) 2 months ago
  lhez 9984cbb61d opencl: fix boundary handling for mul_mm (#16875) 2 months ago
  RodriMora ce18efeaf1 convert : update transformers requirements (#16866) 2 months ago
  chansikpark 16724b5b68 server : bump request URI max length to 32768 (#16862) 2 months ago
  Georgi Gerganov b52edd2558 server : remove n_past (#16818) 2 months ago
  Max Krasnyansky 517b7170e1 cpu: introduce chunking for repack matmuls and enable matmul-id chunking on ARM64 (#16833) 2 months ago
  Shagun Bera 835e918d84 common: fix typo in cli help text (#16864) 2 months ago
  JJJYmmm d261223d24 model: add support for qwen3vl series (#16780) 2 months ago
  Max Krasnyansky dcca0d3ab8 cpu: introduce chunking for flash attention (#16829) 2 months ago
  Tianyue-Zhao bacddc049a model: Add support for CogVLM model (#15002) 2 months ago
  Sigbjørn Skjæret 229bf68628 cuda : fix argsort with 64k+ rows (#16849) 2 months ago
  Jan Boon d7395115ba llama : use std::abs instead of abs (#16853) 2 months ago
  Jeff Bolz 052df28b0e vulkan: Handle argsort with a large number of rows (#16851) 2 months ago
  Oliver Simons 8b11deea46 Hide latency of bias and gate-loading (#16847) 2 months ago
  Jeff Bolz b9ce940177 vulkan: Fuse rope+set_rows (#16769) 2 months ago
  Xuan-Son Nguyen 3464bdac37 llama: fix ASAN error with M-RoPE (#16848) 2 months ago
  Xuan-Son Nguyen e3af5563bd llama: store mrope data in KV cell (#16825) 2 months ago
  Jeff Bolz 10fcc41290 vulkan: Update topk_moe fusion to handle gpt's late softmax (#16656) 2 months ago
  Ruben Ortlam bcf5bda6f5 Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536) 2 months ago