Jeff Bolz | b4e335d8dc | vulkan: fuse rms_norm + mul + rope (+ view + set_rows) (#16977) | 2 months ago
Jeff Bolz | d6fe40fa00 | vulkan: Fix test-thread-safety crashes (#17024) | 2 months ago
Acly | ac76d36201 | vulkan : refactor buffer handling in vk_op_f32 (#16840) | 2 months ago
Jeff Bolz | a44d77126c | vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle fusion (#16919) | 2 months ago
Jeff Bolz | ad51c0a720 | vulkan: remove the need for the dryrun (#16826) | 2 months ago
Jeff Bolz | 5d8bb900bc | vulkan: Fix multi_add invalid descriptor usage (#16899) | 2 months ago
Jeff Bolz | 2e76e01360 | vulkan: fuse mul_mat+add and mul_mat_id+add_id (#16868) | 2 months ago
Masato Nakasaka | 2976b0374d | vulkan: Fix crash when FP16 mul_mat accumulation is not supported (#16796) | 2 months ago
JJJYmmm | d261223d24 | model: add support for qwen3vl series (#16780) | 2 months ago
Jeff Bolz | 052df28b0e | vulkan: Handle argsort with a large number of rows (#16851) | 2 months ago
Jeff Bolz | b9ce940177 | vulkan: Fuse rope+set_rows (#16769) | 2 months ago
Jeff Bolz | 10fcc41290 | vulkan: Update topk_moe fusion to handle gpt's late softmax (#16656) | 2 months ago
Ruben Ortlam | bcf5bda6f5 | Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536) | 2 months ago
Jeff Bolz | f549b0007d | vulkan: Call ggml_vk_buffer_write_2d from ggml_vk_buffer_copy (#16793) | 2 months ago
Acly | 10640e31aa | ggml : fix interpolate with align-corners and ne=1 (#16700) | 2 months ago
Gilad S. | 3cfa9c3f12 | vulkan: deduplicate Microsoft Direct3D12 devices (#16689) | 3 months ago
Giuseppe Scrivano | f90b4a8efe | vulkan: delete dead code (#16732) | 3 months ago
Jeff Bolz | 8423d01931 | vulkan: Optimize SSM_SCAN (#16645) | 3 months ago
Jeff Bolz | e56abd2098 | vulkan: Implement topk_moe fused shader, ported from CUDA (#16641) | 3 months ago
Giuseppe Scrivano | 3d4e86bbeb | vulkan: Add State Space Model (SSM) Operations Support (#16463) | 3 months ago
Jeff Bolz | 4258e0cfe7 | vulkan: Support FA with K/V in F32 (#16543) | 3 months ago
Jeff Bolz | 2aaf0a2a20 | vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE (#16354) | 3 months ago
Jeff Bolz | e308efda8e | vulkan: in flash attention, bounds check against nem1 (don't rely on GGML_KQ_MASK_PAD) (#16316) | 3 months ago
Eve | 132d673554 | vulkan: make ggml_vk_default_dispatcher support older vulkan headers (#16345) | 3 months ago
Jeff Bolz | d8359f5fde | vulkan: 64-bit im2col (#16135) | 3 months ago
Jeff Bolz | 1384abf8b8 | vulkan: handle mat_mul with A matrix > 4GB (#16176) | 3 months ago
Acly | 8656f5de68 | vulkan : make the vulkan.hpp dynamic dispatcher instance private (#16224) | 3 months ago
Dmytro Minochkin | 0499b29c6f | vulkan: throw system error instead of SIGABRT during init on older devices (#16156) | 3 months ago
Jeff Bolz | 3f81b4e91c | vulkan: support GET_ROWS for k-quants (#16235) | 3 months ago
Sigbjørn Skjæret | 3ecb2f671a | ggml : implement set_rows with i32 index (#16159) | 4 months ago