Giuseppe Scrivano | 7d77f07325 | vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP, ROUND, CEIL, FLOOR, TRUNC (#17319) | 1 month ago
Jeff Bolz | 1fa4551af0 | vulkan: support larger argsort (#17313) | 1 month ago
Jeff Bolz | 2eba631b81 | vulkan: Add copy_transpose shader (#17371) | 1 month ago
Ruben Ortlam | 980b7cd17e | vulkan: force full subgroups for flash attention to fix intel subgroup crash (#17356) | 1 month ago
Jeff Bolz | da95bf2a85 | vulkan: support noncontig i32 copy (#17328) | 2 months ago
Ruben Ortlam | 38e2c1b412 | vulkan: add log RTE support to fix Nvidia CI (#17320) | 2 months ago
Pavels Zaicenkovs | dbed61294a | vulkan: add LOG operation support for F32 and F16 (#17183) | 2 months ago
Ruben Ortlam | 80deff3648 | vulkan: fix MMQ quantize_y condition (#17301) | 2 months ago
Jeff Bolz | 24dc769f1b | vulkan: Fuse mul_mat_id+add_id+mul and mul_mat+add+add. (#17287) | 2 months ago
Giuseppe Scrivano | 1568d13c2c | vulkan: implement ABS and NEG (#17245) | 2 months ago
Jeff Bolz | 439342ea0b | vulkan: Use ggml_vk_tensor_subbuffer in mul_mat_vec(id) paths (#17244) | 2 months ago
Jeff Bolz | 234ae7d7bd | vulkan: skip all-negative-inf blocks in FA (#17186) | 2 months ago
Jeff Bolz | 38eaf32af1 | vulkan: change graph_compute to be async and enable get_tensor_async (#17158) | 2 months ago
Eve | 7d019cff74 | disable rms norm mul rope for chips with no fp16 rte (#17134) | 2 months ago
Ruben Ortlam | 85234a4b3a | vulkan: fix validation issue introduced by #16868 (#17145) | 2 months ago
Acly | 1032256ec9 | cuda/vulkan : bicubic interpolation (#17022) | 2 months ago
Ruben Ortlam | 392e09a608 | vulkan: fix memory allocations (#17122) | 2 months ago
Ruben Ortlam | 7f3e9d339c | vulkan: iGPU memory reporting fix (#17110) | 2 months ago
Ruben Ortlam | 8a3519b708 | vulkan: fix mmq out of bounds reads (#17108) | 2 months ago
Jeff Bolz | 80a6cf6347 | vulkan: fuse mul_mat_id + mul (#17095) | 2 months ago
Jeff Bolz | 53d7d21e61 | vulkan: Use spec constants for conv2d s/d/p and kernel W/H (#16978) | 2 months ago
Jeff Bolz | b4e335d8dc | vulkan: fuse rms_norm + mul + rope (+ view + set_rows) (#16977) | 2 months ago
Jeff Bolz | d6fe40fa00 | vulkan: Fix test-thread-safety crashes (#17024) | 2 months ago
Acly | ac76d36201 | vulkan : refactor buffer handling in vk_op_f32 (#16840) | 2 months ago
Jeff Bolz | a44d77126c | vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle fusion (#16919) | 2 months ago
Jeff Bolz | ad51c0a720 | vulkan: remove the need for the dryrun (#16826) | 2 months ago
Jeff Bolz | 5d8bb900bc | vulkan: Fix multi_add invalid descriptor usage (#16899) | 2 months ago
Jeff Bolz | 2e76e01360 | vulkan: fuse mul_mat+add and mul_mat_id+add_id (#16868) | 2 months ago
Masato Nakasaka | 2976b0374d | vulkan: Fix crash when FP16 mul_mat accumulation is not supported (#16796) | 2 months ago
JJJYmmm | d261223d24 | model: add support for qwen3vl series (#16780) | 2 months ago