Commit History

Author SHA1 Message Date
  Johannes Gäßler fd08255d0d CUDA: non-contiguous (RMS) norm support (#11659) 11 months ago
  Akarshan Biswas 6e84b0ab8e SYCL : SOFTMAX F16 mask support and other fixes (#11261) 11 months ago
  Johannes Gäßler 8137b4bb2b CPU/CUDA: fix (GQA) mul mat back, add CUDA support (#11380) 11 months ago
  Jeff Bolz 564804b79b tests: fix some mul_mat test gaps (#11375) 11 months ago
  Jeff Bolz 44e18ef939 vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281) 1 year ago
  Jeff Bolz bd38ddea01 vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 1 year ago
  Johannes Gäßler 9c8dcefe17 CUDA: backwards pass for misc. ops, add tests (#11257) 1 year ago
  Johannes Gäßler 432df2d5f9 RoPE: fix back, CUDA support for back + noncont. (#11240) 1 year ago
  Molly Sophia ee7136c6d1 llama: add support for QRWKV6 model architecture (#11001) 1 year ago
  Jeff Bolz 716bd6dec3 vulkan: optimize mul_mat for small values of N (#10991) 1 year ago
  Jeff Bolz a813badbbd vulkan: im2col and matmul optimizations for stable diffusion (#10942) 1 year ago
  Georgi Gerganov 0006f5a74a ggml : update ggml_backend_cpu_device_supports_op (#10867) 1 year ago
  HimariO ba1cb19cdd llama : add Qwen2VL support + multimodal RoPE (#10361) 1 year ago
  PAB a8cbab201d ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037) 1 year ago
  PAB c2082d93a8 ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034) 1 year ago
  Jeff Bolz 2759916d86 vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (#10642) 1 year ago
  PAB efb6ae9630 feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019) 1 year ago
  Georgi Gerganov 0115df2f65 metal : small-batch mat-mul kernels (#10581) 1 year ago
  Georgi Gerganov f0678c5ff4 ggml : fix I8MM Q4_1 scaling factor conversion (#10562) 1 year ago
  Jeff Bolz 904109ed0d vulkan: fix group_norm (#10496) 1 year ago
  Diego Devesa 5931c1f233 ggml : add support for dynamic loading of backends (#10469) 1 year ago
  Diego Devesa a5e47592b6 cuda : optimize argmax (#10441) 1 year ago
  Johannes Gäßler 02e4eaf22f ggml-opt: fix data corruption (ggml/1022) 1 year ago
  Jeff Bolz b3e585988f vulkan: Optimize soft_max (#10301) 1 year ago
  Johannes Gäßler 8a43e940ab ggml: new optimization interface (ggml/988) 1 year ago
  Jeff Bolz 80dd7ff22f vulkan: Optimize contiguous copies (#10254) 1 year ago
  Georgi Gerganov 841f27abdb metal : optimize FA kernels (#10171) 1 year ago
  Zhiyuan Li 3bcd40b3c5 Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (#10133) 1 year ago
  Georgi Gerganov 5c333e0140 metal : add BF16 support (#8439) 1 year ago
  Diego Devesa 9f40989351 ggml : move CPU backend to a separate file (#10144) 1 year ago