cturan/llama.cpp

Author	SHA1 Message	Date
Eve	b3b6d862cf vulkan: matmul gcn tuning (#13016)	9 months ago
Jeff Bolz	66168204be vulkan: support noncontiguous rms_norm (#13031)	9 months ago
Georgi Gerganov	2f74c354c0 graph : make FA compatible with MLA + add initial Metal kernels (#12953)	9 months ago
Jeff Bolz	015022bb53 vulkan: enable coopmat2 FA gqa and split_k optimizations more often (#12931)	9 months ago
Diego Devesa	fe92821ea9 ggml : add bilinear upscale support (ggml/1185)	9 months ago
Jeff Bolz	0090950f67 vulkan: In coopmat2 mmq, load q4_k/q5_k scales through shared memory (#12833)	9 months ago
Jeff Bolz	80b717d493 vulkan: Use unclamped loads for flash attention mask (#12720)	9 months ago
0cc4m	6bf28f0111 Vulkan: Tune Vulkan mmq int dot shader for performance (#12767)	9 months ago
Jeff Bolz	74d4f5b041 vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency (#12630)	10 months ago
Jeff Bolz	f01bd02376 vulkan: Implement split_k for coopmat2 flash attention. (#12627)	10 months ago
Jeff Bolz	be0a0f8cae vulkan: Implement grouped query attention in the coopmat2 FA shader (#12559)	10 months ago
Wagner Bruna	2bb3597e42 vulkan: fix build when glslc doesn't support coopmat (#12683)	10 months ago
0cc4m	a8a1f33567 Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135)	10 months ago
Georgi Gerganov	b4ae50810e metal : improve FA + improve MoE (#12612)	10 months ago
Jeff Bolz	eddfb43850 vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505)	10 months ago
stduhpf	4375415b4a Vulkan: RTE rounding for cpy to quant (#12480)	10 months ago
Jeff Bolz	c446b2edd2 vulkan: Submit once enough matmul work has been recorded (#12406)	10 months ago
0cc4m	fd123cfead Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentation and driver issues (#12434)	10 months ago
Molly Sophia	7dfad387e3 llama: Add support for RWKV v7 architecture (#12412)	10 months ago
Jeff Bolz	484a8ab513 vulkan: Add N/2 and N/4 optimized paths in coopmat2 shader (#12312)	10 months ago
Daniele	cf2270e4d3 vulkan: subgroup size tuning (#12087)	10 months ago
Jeff Bolz	891c63956d vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (#12273)	10 months ago
Jeff Bolz	2f21123c1d vulkan: Adjust coopmat2 tile sizes and selection heuristic (#12258)	10 months ago
cmdr2	0cbee131ad cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)	11 months ago
William Tambellini	70680c48e5 ggml : upgrade init_tensor API to return a ggml_status (#11854)	11 months ago
Rémy O	438a83926a vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)	11 months ago
Jeff Bolz	a82c9e7c23 vulkan: fix assertion when qy_needs_dequant (#12068)	11 months ago
Judd	c132239bfb add OP sigmoid (#12056)	11 months ago
Rémy O	61d4f39dfe vulkan: implement more backpropagation operators (#11914)	11 months ago
Rémy O	2eea03d86a vulkan: implement several ops relevant for ggml_opt (#11769)	11 months ago

