lhez
|
52e5d421f1
opencl: fix rms_norm_mul (#17250)
|
2 months ago |
shaofeiqi
|
4db5641210
opencl: add kernel to handle mat mul in attention to improve encoding speed (#17181)
|
2 months ago |
lhez
|
ece0f5c177
opencl: add fastdiv and use it in set_rows, ported from cuda (#17090)
|
2 months ago |
Acly
|
1032256ec9
cuda/vulkan : bicubic interpolation (#17022)
|
2 months ago |
lhez
|
c5023daf60
opencl: support imrope (#16914)
|
3 months ago |
Acly
|
10640e31aa
ggml : fix interpolate with align-corners and ne=1 (#16700)
|
3 months ago |
lhez
|
6ea37f5739
opencl: fix warnings and clean up profiling (#16688)
|
3 months ago |
Shawn Gu
|
81387858f1
opencl: transposed gemm/gemv moe kernel with mxfp4,f32 (#16602)
|
3 months ago |
lhez
|
0cb7a0683b
opencl: add q8_0 mm support (#16469)
|
3 months ago |
Aman Gupta
|
120bf7046d
CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion (#16577)
|
3 months ago |
lhez
|
5016b72862
opencl: fix build targeting CL 2 (#16554)
|
3 months ago |
lhez
|
7c156df414
opencl: support pad_ext (#15888)
|
4 months ago |
lhez
|
d1c84a662d
opencl: support ne3 in get_rows (#15866)
|
4 months ago |
Sigbjørn Skjæret
|
3ecb2f671a
ggml : implement set_rows with i32 index (#16159)
|
4 months ago |
lhez
|
51f5a45fbe
opencl: fix concat crash on win arm64 with Adreno (#15944)
|
4 months ago |
lhez
|
c4510dc937
opencl: initial `q8_0` mv support (#15732)
|
4 months ago |
Shawn Gu
|
3edd87cd05
opencl: optimize mxfp4 kernels (#16037)
|
4 months ago |
Jeff Bolz
|
c0b45097c3
rename optimize_graph to graph_optimize (#16082)
|
4 months ago |
Jeff Bolz
|
e68aa10d8f
vulkan: sort graph to allow more parallel execution (#15850)
|
5 months ago |
leejet
|
0a1b3982cd
ggml: add ops for WAN video model (cuda && cpu) (#15669)
|
5 months ago |
rmatif
|
820bc98531
opencl: add hs=40 to FA (#15758)
|
5 months ago |
rmatif
|
97669e4073
opencl: add attn sinks support for FA kernels (#15706)
|
5 months ago |
rmatif
|
86076f92de
OpenCL: add fused group_norm/norm, mul, add (#15314)
|
5 months ago |
lhez
|
f7207b0415
opencl: fix support ops condition for `rms_norm` (#15560)
|
5 months ago |
lhez
|
fb22dd07a6
opencl: mark `argsort` unsupported if cols exceed workgroup limit (#15375)
|
5 months ago |
rmatif
|
912ff8c119
OpenCL: add initial FA support (#14987)
|
5 months ago |
lhez
|
e2c1bfff53
opencl: add initial mxfp4 support via mv (#15270)
|
5 months ago |
rmatif
|
60a7658810
opencl: allow mixed f16/f32 `add` (#15140)
|
5 months ago |
AN Long
|
cd6983d56d
ggml : fix field name when new ggml_backend (#14944)
|
6 months ago |
lhez
|
aaa3d07ae7
opencl: support sink in `soft_max` (attn sinks) (#15152)
|
6 months ago |