Commit History

Author SHA1 Message Date
  safranowith 2330de7b84 SYCL: Add support for FLOOR,CEIL,ROUND and TRUNC unary operators (#16613) 2 months ago
  Ilia Ilmer 9ad4f1931e metal : add `CONV_TRANSPOSE_2D` (#16542) 3 months ago
  lhez 0cb7a0683b opencl: add q8_0 mm support (#16469) 3 months ago
  Sam/Samuel f4ce81c45e metal: optimise `GGML_OP_SUM` (#16559) 3 months ago
  Aman Gupta 48e2fa9fb7 CUDA: add fp kernel for larger batch size MoE (#16512) 3 months ago
  Georgi Gerganov e60f241eac metal : FA support F32 K and V and head size = 32 (#16531) 3 months ago
  Georgi Gerganov 0a319bb75e metal : add support for non-padded FA KV (#16148) 3 months ago
  Georgi Gerganov 1d6092fc72 tests : add -INF blocks to the KQ mask in the FA tests (#16380) 3 months ago
  Reese Levine ef07a40906 ggml webgpu: add support for soft_max, optimize rms_norm (#16357) 3 months ago
  Reese Levine 8d78cd2613 ggml webgpu: support for rope,div,sub,glu,scale,cont operators (#16187) 3 months ago
  Jeff Bolz a74a0d69f3 tests: override test_set_rows::max_nmse_err to allow for occasional rounding differences (#16295) 3 months ago
  Sigbjørn Skjæret adc76347d7 ggml : check cuda and metal argsort limits and add test (#16323) 3 months ago
  Sigbjørn Skjæret b887d2f341 ggml : fix GGML_F32_VEC_FMA argument order in ggml_vec_mad1_f32 (#16307) 3 months ago
  Jeff Bolz d8359f5fde vulkan: 64-bit im2col (#16135) 3 months ago
  Georgi Gerganov 6a2c6145a0 metal : extend mat-mat multiplication support (#16225) 3 months ago
  Jeff Bolz 1384abf8b8 vulkan: handle mat_mul with A matrix > 4GB (#16176) 3 months ago
  Aman Gupta c0bfc57af4 CUDA: mul_mat_id for mmf for bs <= 64 for f16 and bs <= 32 for f32 (#16277) 3 months ago
  Aman Gupta 077c94d0ca CUDA: add a fused top-K MoE kernel (#16130) 3 months ago
  Georgi Gerganov dfcd53f7ec metal : fuse NORM + MUL + ADD, support non-multiples of 4 (#16220) 3 months ago
  Sigbjørn Skjæret 3ecb2f671a ggml : implement set_rows with i32 index (#16159) 3 months ago
  Shin-myoung-serp 96fdca043b Vulkan: add conv_transpose_2d operation (#16022) 3 months ago
  Ruben Ortlam 9073a73d82 vulkan: vec dot matrix multiplication fix (#16151) 3 months ago
  Xuan-Son Nguyen 0dd58b6877 ggml : refactor forward_dup for cpu backend (#16062) 4 months ago
  Bowen Han 38dbdf4c05 CUDA: Optimize PAD_REFLECT_1D (#15957) 4 months ago
  Reese Levine d304f459d8 GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS operators (#16018) 4 months ago
  Georgi Gerganov 0320ac5264 metal : refactor + optimize v2 (#15995) 4 months ago
  Oliver Simons 00681dfc16 CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#15872) 4 months ago
  Daniel Bevenius e7b6d83b52 tests : filter out no-ops from coverage report (#15900) 4 months ago
  Jeff Bolz 4f63cd705c vulkan: Fix OOB accesses in soft_max_back (#15861) 4 months ago
  Aman Gupta a972faebed CUDA: Add mul_mat_id support for the mmf kernel (#15767) 4 months ago