Commit History

Author SHA1 Message Date
  Sigbjørn Skjæret 3ecb2f671a ggml : implement set_rows with i32 index (#16159) 3 months ago
  Shin-myoung-serp 96fdca043b Vulkan: add conv_transpose_2d operation (#16022) 3 months ago
  Ruben Ortlam 9073a73d82 vulkan: vec dot matrix multiplication fix (#16151) 3 months ago
  Xuan-Son Nguyen 0dd58b6877 ggml : refactor forward_dup for cpu backend (#16062) 4 months ago
  Bowen Han 38dbdf4c05 CUDA: Optimize PAD_REFLECT_1D (#15957) 4 months ago
  Reese Levine d304f459d8 GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS operators (#16018) 4 months ago
  Georgi Gerganov 0320ac5264 metal : refactor + optimize v2 (#15995) 4 months ago
  Oliver Simons 00681dfc16 CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#15872) 4 months ago
  Daniel Bevenius e7b6d83b52 tests : filter out no-ops from coverage report (#15900) 4 months ago
  Jeff Bolz 4f63cd705c vulkan: Fix OOB accesses in soft_max_back (#15861) 4 months ago
  Aman Gupta a972faebed CUDA: Add mul_mat_id support for the mmf kernel (#15767) 4 months ago
  Georgi Gerganov f28d4f4ac9 metal : refactor + optimize (#15857) 4 months ago
  Xuan-Son Nguyen 9fcb29f22f ggml: allow casting between f32 and i32 (#15783) 4 months ago
  Jeff Bolz d413dca003 tests: large sizes for get_rows (#15687) 4 months ago
  Jeff Bolz 3976dfbe00 vulkan: support im2col_3d (#15795) 4 months ago
  Jeff Bolz c97b5e5854 vulkan: Support pad_ext (#15794) 4 months ago
  Daniel Bevenius 3a550b5ca4 tests : add --list-ops and --show-coverage options (#15745) 4 months ago
  leejet 0a1b3982cd ggml: add ops for WAN video model (cuda && cpu) (#15669) 4 months ago
  rmatif 86076f92de OpenCL: add fused group_norm/norm, mul, add (#15314) 4 months ago
  Eve 44b1efa41a tests: add performance test for mul mat id (#15543) 4 months ago
  Georgi Gerganov 1d8d83deaa metal : improve `MUL_MAT_ID` (#15541) 4 months ago
  Jeff Bolz 34bdbbd7c2 vulkan: Remove splitting for mul_mat_id (#15568) 4 months ago
  Jeff Bolz 886b97a5d6 tests: Generate unique input values for count_equal (#15487) 4 months ago
  Jeff Bolz c9a24fb932 vulkan: Support FA with any multiple of 8 head sizes (#15537) 4 months ago
  Jeff Bolz 611f419cff vulkan: optimize rms_norm, and allow the work to spread across multiple SMs (#15281) 4 months ago
  Acly 0a9b43e507 vulkan : support ggml_mean (#15393) 4 months ago
  rmatif 92f7f0a53c ggml: add `conv3d` op (#15182) 4 months ago
  Jeff Bolz 96452a3fa4 vulkan: Reuse conversion results in prealloc_y (#15410) 5 months ago
  Jeff Bolz de5627910d vulkan: Optimize argsort (#15354) 5 months ago
  Jeff Bolz 1fe00296f5 vulkan: fuse adds (#15252) 5 months ago