Jeff Bolz | 3229a23fa6 | vulkan: support GGML_OP_DIAG (#17893) | 1 month ago
Jeff Bolz | 303f8615e9 | vulkan: Multi-pass softmax for large number of cols (#17892) | 1 month ago
Jeff Bolz | 07a10c1090 | vulkan: Allow non-pow2 n_experts in topk_moe (#17872) | 1 month ago
Jeff Bolz | db97837385 | vulkan: perf_logger improvements (#17672) | 1 month ago
Phylliida Dev | 09c7c50e64 | ggml : add circular tiling support to pad, for Vulkan, CUDA, and CPU (used for making seamless textures) (#16985) | 1 month ago
Jeff Bolz | 2960eb2975 | vulkan: Use one row per workgroup for f32 mmv (#17711) | 1 month ago
Jeff Bolz | c6c5e85979 | vulkan: support solve_tri with larger N/K values (#17781) | 1 month ago
Masato Nakasaka | 67788f6846 | vulkan: Replace deprecated VK_EXT_validation_features (#17637) | 1 month ago
Masato Nakasaka | d8c0a7b085 | vulkan: Fix mismatch in TOPK_MOE unit test (#17541) | 1 month ago
Jeff Bolz | a0f3897d53 | vulkan: fix top_k bug when there are ties in the input (#17659) | 1 month ago
Acly | e15cd06a94 | vulkan : support conv-2d with large output size (#17685) | 1 month ago
Jeff Bolz | 6ab0d64960 | vulkan: enable mmvq for q2_k on NVIDIA (#17675) | 1 month ago
Jeff Bolz | 93bb92664e | vulkan: set all memory allocations to high priority (#17624) | 1 month ago
Jeff Bolz | 61bde8e21f | vulkan: Reduce temporary memory usage for TOP_K (#17623) | 1 month ago
Tarek Dakhran | 2ba719519d | model: LFM2-VL fixes (#17577) | 1 month ago
Ruben Ortlam | 47a268ea50 | Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support (#16900) | 1 month ago
Jeff Bolz | 59d8d4e963 | vulkan: improve topk perf for large k, fix overflow in unit tests (#17582) | 1 month ago
Jeff Bolz | 35cf8887e1 | vulkan: Implement GGML_OP_TRI (#17503) | 1 month ago
Jeff Bolz | 4abef75f2c | vulkan: Implement SOLVE_TRI (#17486) | 2 months ago
Acly | b78db3bd50 | vulkan : move contiguous checks to device_supports_op (#17490) | 2 months ago
Jeff Bolz | 142df17c9c | vulkan: use a fixed 1KB buffer for the add_rms_fusion opt (#17514) | 2 months ago
Jeff Bolz | eec1e33a9e | vulkan: allow graph_optimize for prompt processing workloads (#17475) | 2 months ago
Jeff Bolz | 879d673759 | vulkan: Implement top-k (#17418) | 2 months ago
Jeff Bolz | b3b03a7baf | vulkan: Implement GGML_OP_CUMSUM (#17479) | 2 months ago
Jeff Bolz | d414db02d3 | vulkan: Use fewer rows for scalar FA when HS is not a multiple of 16 (#17455) | 2 months ago
Jeff Bolz | 3d07caa99b | vulkan: more FA details in vk_perf_logger (#17443) | 2 months ago
Jeff Bolz | 54d83bbe85 | vulkan: remove a couple unnecessary switches (#17419) | 2 months ago
Jeff Bolz | f1ffbba68e | vulkan: disable async for older Intel devices (#17369) | 2 months ago
Giuseppe Scrivano | 7d77f07325 | vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP, ROUND, CEIL, FLOOR, TRUNC (#17319) | 2 months ago
Jeff Bolz | 1fa4551af0 | vulkan: support larger argsort (#17313) | 2 months ago