Aman Gupta
|
077c94d0ca
CUDA: add a fused top-K MoE kernel (#16130)
|
3 months ago |
Georgi Gerganov
|
dfcd53f7ec
metal : fuse NORM + MUL + ADD, support non-multiples of 4 (#16220)
|
3 months ago |
Sigbjørn Skjæret
|
3ecb2f671a
ggml : implement set_rows with i32 index (#16159)
|
3 months ago |
Shin-myoung-serp
|
96fdca043b
Vulkan: add conv_transpose_2d operation (#16022)
|
3 months ago |
Ruben Ortlam
|
9073a73d82
vulkan: vec dot matrix multiplication fix (#16151)
|
3 months ago |
Xuan-Son Nguyen
|
0dd58b6877
ggml : refactor forward_dup for cpu backend (#16062)
|
4 months ago |
Bowen Han
|
38dbdf4c05
CUDA: Optimize PAD_REFLECT_1D (#15957)
|
4 months ago |
Reese Levine
|
d304f459d8
GGML WebGPU: Support for ADD, MUL, RMS_NORM, GET_ROWS operators (#16018)
|
4 months ago |
Georgi Gerganov
|
0320ac5264
metal : refactor + optimize v2 (#15995)
|
4 months ago |
Oliver Simons
|
00681dfc16
CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#15872)
|
4 months ago |
Daniel Bevenius
|
e7b6d83b52
tests : filter out no-ops from coverage report (#15900)
|
4 months ago |
Jeff Bolz
|
4f63cd705c
vulkan: Fix OOB accesses in soft_max_back (#15861)
|
4 months ago |
Aman Gupta
|
a972faebed
CUDA: Add mul_mat_id support for the mmf kernel (#15767)
|
4 months ago |
Georgi Gerganov
|
f28d4f4ac9
metal : refactor + optimize (#15857)
|
4 months ago |
Xuan-Son Nguyen
|
9fcb29f22f
ggml: allow casting between f32 and i32 (#15783)
|
4 months ago |
Jeff Bolz
|
d413dca003
tests: large sizes for get_rows (#15687)
|
4 months ago |
Jeff Bolz
|
3976dfbe00
vulkan: support im2col_3d (#15795)
|
4 months ago |
Jeff Bolz
|
c97b5e5854
vulkan: Support pad_ext (#15794)
|
4 months ago |
Daniel Bevenius
|
3a550b5ca4
tests : add --list-ops and --show-coverage options (#15745)
|
4 months ago |
leejet
|
0a1b3982cd
ggml: add ops for WAN video model (cuda && cpu) (#15669)
|
4 months ago |
rmatif
|
86076f92de
OpenCL: add fused group_norm/norm, mul, add (#15314)
|
4 months ago |
Eve
|
44b1efa41a
tests: add performance test for mul mat id (#15543)
|
4 months ago |
Georgi Gerganov
|
1d8d83deaa
metal : improve `MUL_MAT_ID` (#15541)
|
4 months ago |
Jeff Bolz
|
34bdbbd7c2
vulkan: Remove splitting for mul_mat_id (#15568)
|
4 months ago |
Jeff Bolz
|
886b97a5d6
tests: Generate unique input values for count_equal (#15487)
|
4 months ago |
Jeff Bolz
|
c9a24fb932
vulkan: Support FA with any multiple of 8 head sizes (#15537)
|
4 months ago |
Jeff Bolz
|
611f419cff
vulkan: optimize rms_norm, and allow the work to spread across multiple SMs (#15281)
|
4 months ago |
Acly
|
0a9b43e507
vulkan : support ggml_mean (#15393)
|
4 months ago |
rmatif
|
92f7f0a53c
ggml: add `conv3d` op (#15182)
|
4 months ago |
Jeff Bolz
|
96452a3fa4
vulkan: Reuse conversion results in prealloc_y (#15410)
|
4 months ago |