Commit History

Author SHA1 Message Date
  Jeff Bolz eddfb43850 vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505) 10 months ago
  Gaurav Garg 517b5ddbf0 CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183) 10 months ago
  Molly Sophia 7dfad387e3 llama: Add support for RWKV v7 architecture (#12412) 10 months ago
  Jeff Bolz bf69cfe62f vulkan: fix bug in coopmat1 mul_mat_id (#12316) 10 months ago
  cmdr2 0cbee131ad cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129) 10 months ago
  cmdr2 87abb7e903 cuda/cpu: Increase support for fp16 unary operations (ggml/1125) 10 months ago
  cmdr2 f54a4ba11e Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121) 10 months ago
  Diego Devesa d5c63cd7f9 test-backend-ops : add option -p to filter by op params (#12155) 10 months ago
  William Tambellini 70680c48e5 ggml : upgrade init_tensor API to return a ggml_status (#11854) 10 months ago
  Johannes Gäßler 5fa07c2f93 CUDA: optimize FA for GQA + large batches (#12014) 10 months ago
  Rémy O 2eea03d86a vulkan: implement several ops relevant for ggml_opt (#11769) 11 months ago
  Johannes Gäßler fd08255d0d CUDA: non-contiguous (RMS) norm support (#11659) 11 months ago
  Akarshan Biswas 6e84b0ab8e SYCL : SOFTMAX F16 mask support and other fixes (#11261) 11 months ago
  Johannes Gäßler 8137b4bb2b CPU/CUDA: fix (GQA) mul mat back, add CUDA support (#11380) 11 months ago
  Jeff Bolz 564804b79b tests: fix some mul_mat test gaps (#11375) 11 months ago
  Jeff Bolz 44e18ef939 vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281) 1 year ago
  Jeff Bolz bd38ddea01 vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 1 year ago
  Johannes Gäßler 9c8dcefe17 CUDA: backwards pass for misc. ops, add tests (#11257) 1 year ago
  Johannes Gäßler 432df2d5f9 RoPE: fix back, CUDA support for back + noncont. (#11240) 1 year ago
  Molly Sophia ee7136c6d1 llama: add support for QRWKV6 model architecture (#11001) 1 year ago
  Jeff Bolz 716bd6dec3 vulkan: optimize mul_mat for small values of N (#10991) 1 year ago
  Jeff Bolz a813badbbd vulkan: im2col and matmul optimizations for stable diffusion (#10942) 1 year ago
  Georgi Gerganov 0006f5a74a ggml : update ggml_backend_cpu_device_supports_op (#10867) 1 year ago
  HimariO ba1cb19cdd llama : add Qwen2VL support + multimodal RoPE (#10361) 1 year ago
  PAB a8cbab201d ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037) 1 year ago
  PAB c2082d93a8 ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034) 1 year ago
  Jeff Bolz 2759916d86 vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (#10642) 1 year ago
  PAB efb6ae9630 feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019) 1 year ago
  Georgi Gerganov 0115df2f65 metal : small-batch mat-mul kernels (#10581) 1 year ago
  Georgi Gerganov f0678c5ff4 ggml : fix I8MM Q4_1 scaling factor conversion (#10562) 1 year ago