Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Eve | b3b6d862cf | vulkan: matmul gcn tuning (#13016) | 9 months ago |
| Jeff Bolz | 66168204be | vulkan: support noncontiguous rms_norm (#13031) | 9 months ago |
| Georgi Gerganov | 2f74c354c0 | graph : make FA compatible with MLA + add initial Metal kernels (#12953) | 9 months ago |
| Jeff Bolz | 015022bb53 | vulkan: enable coopmat2 FA gqa and split_k optimizations more often (#12931) | 9 months ago |
| Diego Devesa | fe92821ea9 | ggml : add bilinear upscale support (ggml/1185) | 9 months ago |
| Jeff Bolz | 0090950f67 | vulkan: In coopmat2 mmq, load q4_k/q5_k scales through shared memory (#12833) | 9 months ago |
| Jeff Bolz | 80b717d493 | vulkan: Use unclamped loads for flash attention mask (#12720) | 9 months ago |
| 0cc4m | 6bf28f0111 | Vulkan: Tune Vulkan mmq int dot shader for performance (#12767) | 9 months ago |
| Jeff Bolz | 74d4f5b041 | vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency (#12630) | 10 months ago |
| Jeff Bolz | f01bd02376 | vulkan: Implement split_k for coopmat2 flash attention. (#12627) | 10 months ago |
| Jeff Bolz | be0a0f8cae | vulkan: Implement grouped query attention in the coopmat2 FA shader (#12559) | 10 months ago |
| Wagner Bruna | 2bb3597e42 | vulkan: fix build when glslc doesn't support coopmat (#12683) | 10 months ago |
| 0cc4m | a8a1f33567 | Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135) | 10 months ago |
| Georgi Gerganov | b4ae50810e | metal : improve FA + improve MoE (#12612) | 10 months ago |
| Jeff Bolz | eddfb43850 | vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505) | 10 months ago |
| stduhpf | 4375415b4a | Vulkan: RTE rounding for cpy to quant (#12480) | 10 months ago |
| Jeff Bolz | c446b2edd2 | vulkan: Submit once enough matmul work has been recorded (#12406) | 10 months ago |
| 0cc4m | fd123cfead | Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentation and driver issues (#12434) | 10 months ago |
| Molly Sophia | 7dfad387e3 | llama: Add support for RWKV v7 architecture (#12412) | 10 months ago |
| Jeff Bolz | 484a8ab513 | vulkan: Add N/2 and N/4 optimized paths in coopmat2 shader (#12312) | 10 months ago |
| Daniele | cf2270e4d3 | vulkan: subgroup size tuning (#12087) | 10 months ago |
| Jeff Bolz | 891c63956d | vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (#12273) | 10 months ago |
| Jeff Bolz | 2f21123c1d | vulkan: Adjust coopmat2 tile sizes and selection heuristic (#12258) | 10 months ago |
| cmdr2 | 0cbee131ad | cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129) | 11 months ago |
| William Tambellini | 70680c48e5 | ggml : upgrade init_tensor API to return a ggml_status (#11854) | 11 months ago |
| Rémy O | 438a83926a | vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595) | 11 months ago |
| Jeff Bolz | a82c9e7c23 | vulkan: fix assertion when qy_needs_dequant (#12068) | 11 months ago |
| Judd | c132239bfb | add OP sigmoid (#12056) | 11 months ago |
| Rémy O | 61d4f39dfe | vulkan: implement more backpropagation operators (#11914) | 11 months ago |
| Rémy O | 2eea03d86a | vulkan: implement several ops relevant for ggml_opt (#11769) | 11 months ago |