Shin-myoung-serp | 0014fb4add | ggml vulkan: add hardsigmoid and hardswish operations (#15762) | 5 months ago
Jeff Bolz | 25f1045f07 | vulkan: Fix macro parameter order for f32 matmul shaders (#15716) | 5 months ago
Gilad S. | d4d8dbe383 | vulkan: use memory budget extension to read memory usage (#15545) | 5 months ago
Ruben Ortlam | fec7911f8f | vulkan: disable large mmv subgroups on older Nvidia GPUs (#15717) | 5 months ago
Ruben Ortlam | 02c1813517 | Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903) | 5 months ago
Jeff Bolz | bbbf5ecccb | vulkan: handle large sizes for get_rows (#15686) | 5 months ago
Jeff Bolz | c37052ab4d | vulkan: mul_mat_id coopmat2 optimizations (#15546) | 5 months ago
Daniel Bevenius | 5c16b9c87d | vulkan : remove unused portability_enumeration_ext variable (#15679) | 5 months ago
Jeff Bolz | b97c9edc59 | vulkan: Allow fallback to sysmem memory when vidmem is full (#15649) | 5 months ago
Jeff Bolz | 696fccf354 | vulkan: Skip syncing for prealloc_y when it is reused (#15544) | 5 months ago
Jeff Bolz | 34bdbbd7c2 | vulkan: Remove splitting for mul_mat_id (#15568) | 5 months ago
Ruben Ortlam | 4d917cd4f6 | vulkan: fix min subgroup 16 condition for mmid subgroup optimization (#15565) | 5 months ago
Ruben Ortlam | 043fb27d38 | vulkan: apply MUL_MAT_ID subgroup optimization to non-coopmat devices (#15524) | 5 months ago
Jeff Bolz | c9a24fb932 | vulkan: Support FA with any multiple of 8 head sizes (#15537) | 5 months ago
Ruben Ortlam | a9c6ffcbfa | vulkan: enable Conv2D for Apple after MoltenVK fixed the bug (#15526) | 5 months ago
Jeff Bolz | 611f419cff | vulkan: optimize rms_norm, and allow the work to spread across multiple SMs (#15281) | 5 months ago
Jeff Bolz | 289bf4113e | vulkan: Rewrite synchronization to allow some overlap between nodes (#15489) | 5 months ago
Acly | 0a9b43e507 | vulkan : support ggml_mean (#15393) | 5 months ago
Jeff Bolz | 330c3d2d21 | vulkan: optimize mul_mat_id loading row ids into shared memory (#15427) | 5 months ago
Acly | 97ae5961a4 | vulkan : support conv_2d_dw with f16 weights (#15392) | 5 months ago
Dong Won Kim | 20c2dac8c6 | vulkan: add exp operation (#15456) | 5 months ago
Jeff Bolz | 96452a3fa4 | vulkan: Reuse conversion results in prealloc_y (#15410) | 5 months ago
Jeff Bolz | fec9519802 | vulkan: shorten pipeline name strings (#15431) | 5 months ago
Jeff Bolz | 21c17b5bef | vulkan: Use larger workgroups for mul_mat_vec when M is small (#15355) | 5 months ago
Dong Won Kim | 19f4decae0 | vulkan: support sqrt (#15370) | 5 months ago
Jeff Bolz | de5627910d | vulkan: Optimize argsort (#15354) | 5 months ago
Jeff Bolz | 1fe00296f5 | vulkan: fuse adds (#15252) | 5 months ago
Jeff Bolz | de2192794f | vulkan: Support mul_mat_id with f32 accumulators (#15337) | 5 months ago
Georgi Gerganov | 5edf1592fd | vulkan : fix out-of-bounds access in argmax kernel (#15342) | 5 months ago
Georgi Gerganov | db3010bd23 | vulkan : fix compile warnings on macos (#15340) | 5 months ago