Commit History

Author SHA1 Message Date
  RodriMora ce18efeaf1 convert : update transformers requirements (#16866) 2 months ago
  chansikpark 16724b5b68 server : bump request URI max length to 32768 (#16862) 2 months ago
  Georgi Gerganov b52edd2558 server : remove n_past (#16818) 2 months ago
  Max Krasnyansky 517b7170e1 cpu: introduce chunking for repack matmuls and enable matmul-id chunking on ARM64 (#16833) 2 months ago
  Shagun Bera 835e918d84 common: fix typo in cli help text (#16864) 2 months ago
  JJJYmmm d261223d24 model: add support for qwen3vl series (#16780) 2 months ago
  Max Krasnyansky dcca0d3ab8 cpu: introduce chunking for flash attention (#16829) 2 months ago
  Tianyue-Zhao bacddc049a model: Add support for CogVLM model (#15002) 2 months ago
  Sigbjørn Skjæret 229bf68628 cuda : fix argsort with 64k+ rows (#16849) 2 months ago
  Jan Boon d7395115ba llama : use std::abs instead of abs (#16853) 2 months ago
  Jeff Bolz 052df28b0e vulkan: Handle argsort with a large number of rows (#16851) 2 months ago
  Oliver Simons 8b11deea46 Hide latency of bias and gate-loading (#16847) 2 months ago
  Jeff Bolz b9ce940177 vulkan: Fuse rope+set_rows (#16769) 2 months ago
  Xuan-Son Nguyen 3464bdac37 llama: fix ASAN error with M-RoPE (#16848) 2 months ago
  Xuan-Son Nguyen e3af5563bd llama: store mrope data in KV cell (#16825) 2 months ago
  Jeff Bolz 10fcc41290 vulkan: Update topk_moe fusion to handle gpt's late softmax (#16656) 2 months ago
  Ruben Ortlam bcf5bda6f5 Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536) 2 months ago
  Max Krasnyansky 3eb2be1ca5 Hexagon Op queue & dispatch optimizations (#16820) 2 months ago
  Aman Gupta e41bcce8f0 CUDA: use fastdiv in set-rows (#16834) 2 months ago
  Sigbjørn Skjæret 144a4ce824 vendor : sync minja (#16500) 2 months ago
  Jeff Bolz f549b0007d vulkan: Call ggml_vk_buffer_write_2d from ggml_vk_buffer_copy (#16793) 2 months ago
  Aman Gupta 9a3ea685b9 CUDA: Fix bug in topk-moe for gpt-oss (#16821) 2 months ago
  YaelLogic 338074c383 sycl: add RMS_NORM_BACK operation support (#16808) 2 months ago
  YaelGitAccount 851553ea6b cuda: add SET operation support (#16804) 2 months ago
  Georgi Gerganov 85a7d8677b memory : remove KV cache size padding (#16812) 2 months ago
  Georgi Gerganov a8ca18b4b8 llama-bench : clarify benchmarked parts of the computation (#16823) 2 months ago
  l3utterfly 8284efc35c initialise buffer.device in ggml_hexagon_session (#16816) 2 months ago
  Sam Malayek 1c1409e131 embedding: add raw option for --embd-output-format (#16541) 2 months ago
  Johannes Gäßler 7a0e900e36 llama: consistent ctx <-> buf order for KV cache (#16746) 2 months ago
  Aldehir Rojas 280d97be96 grammar : support array references in json schema (#16792) 2 months ago