Commit History

Author SHA1 Message Date
  Georgi Gerganov b44890df2e model : disable SWA for Phi models (#13676) 8 months ago
  R0CKSTAR 33983057d0 musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647) 8 months ago
  Eve fb1cab201c vulkan: fix warnings (#13626) 8 months ago
  l3utterfly b7a17463ec mtmd-helper : bug fix to token batching in mtmd (#13650) 8 months ago
  Georgi Gerganov be0239693c model : fix llama4 graph (#13663) 8 months ago
  Georgi Gerganov a4090d1174 llama : remove llama_kv_cache_view API + remove deprecated (#13653) 8 months ago
  Johannes Gäßler b69f1647f9 CUDA: skip fully masked-out KV in FA vec kernel (#13584) 8 months ago
  Sigbjørn Skjæret 759e37b0d8 tests : avoid github urls due to throttling (#13654) 8 months ago
  Svetlozar Georgiev 4245e622e0 sycl: disable reorder for sycl mulmat (#13536) 8 months ago
  0cc4m c9c64dee57 Set GLM4 blk.*.attn_output.weight, kqv_out-* matmul to GGML_PREC_F32 to fix infinity values in output (#13639) 8 months ago
  Georgi Gerganov c00a2634be metal : fix typo in FA kernel comments (#13651) 8 months ago
  Georgi Gerganov e298d2fbd0 kv-cache : add SWA support (#13194) 8 months ago
  Xinpeng Dou f0adb80bf7 CANN: Update CANN model support (#13162) 8 months ago
  Nicolò Scipione f7c9429c85 sycl : Overcoming workaround for mmap() allocation on Windows (#13482) 8 months ago
  psocolovsky 1dfbf2cf3a common : add load_progress_callback (#13617) 8 months ago
  0cc4m 8960efd0a6 Vulkan: Add f32 accumulator support to quantized mul mat to fix GLM4 32B incoherence (#13607) 8 months ago
  Alberto Cabrera Pérez 725f23f1f3 sycl : backend documentation review (#13544) 8 months ago
  Xuan-Son Nguyen 92ecdcc06a mtmd : add vision support for llama 4 (#13282) 8 months ago
  Alberto Cabrera Pérez f71f40a284 ci : upgraded oneAPI version in SYCL workflows and dockerfile (#13532) 8 months ago
  Georgi Gerganov d30cb5a7fa sync : ggml 8 months ago
  Johannes Gäßler 6c35981a64 mnist: fix segmentation fault (ggml/1227) 8 months ago
  Diego Devesa 8b5e19aea6 ggml : fix apple OS check in ggml_print_backtrace (ggml/1229) 8 months ago
  Daniel Tang 60aea028b5 ggml : Fix missing backtrace on Linux (ggml/1228) 8 months ago
  Nick 9c55e5c5c2 fix: check model pointer validity before use (#13631) 8 months ago
  Chenguang Li 33d7aed4a8 CANN: Support MOE Model MUL_MAT_ID (#13042) 8 months ago
  Isaac McFadyen 6a2bc8bfb7 server : added --no-prefill-assistant flag (#13608) 8 months ago
  Gilad S. e3a7cf6c5b cmake: use the current build config for vulkan-shaders-gen (#13595) 8 months ago
  Georgi Gerganov 518329b2d4 parallel : add option for non-shared and larger prompts (#13598) 8 months ago
  Jeff Bolz 2f5a4e1e09 vulkan: move common FA code to flash_attn_base.comp (#13556) 8 months ago
  Jeff Bolz 4f41ee11d6 vulkan: use scalar FA rather than coopmat2 when N==1 (#13554) 8 months ago