Wagner Bruna
|
2bb3597e42
vulkan: fix build when glslc doesn't support coopmat (#12683)
|
9 months ago |
0cc4m
|
a8a1f33567
Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135)
|
9 months ago |
Georgi Gerganov
|
b4ae50810e
metal : improve FA + improve MoE (#12612)
|
10 months ago |
Jeff Bolz
|
eddfb43850
vulkan: Optimize mul_mat_vec p021 and nc shaders (#12505)
|
10 months ago |
stduhpf
|
4375415b4a
Vulkan: RTE rounding for cpy to quant (#12480)
|
10 months ago |
Jeff Bolz
|
c446b2edd2
vulkan: Submit once enough matmul work has been recorded (#12406)
|
10 months ago |
0cc4m
|
fd123cfead
Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentation and driver issues (#12434)
|
10 months ago |
Molly Sophia
|
7dfad387e3
llama: Add support for RWKV v7 architecture (#12412)
|
10 months ago |
Jeff Bolz
|
484a8ab513
vulkan: Add N/2 and N/4 optimized paths in coopmat2 shader (#12312)
|
10 months ago |
Daniele
|
cf2270e4d3
vulkan: subgroup size tuning (#12087)
|
10 months ago |
Jeff Bolz
|
891c63956d
vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (#12273)
|
10 months ago |
Jeff Bolz
|
2f21123c1d
vulkan: Adjust coopmat2 tile sizes and selection heuristic (#12258)
|
10 months ago |
cmdr2
|
0cbee131ad
cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)
|
10 months ago |
William Tambellini
|
70680c48e5
ggml : upgrade init_tensor API to return a ggml_status (#11854)
|
10 months ago |
Rémy O
|
438a83926a
vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)
|
10 months ago |
Jeff Bolz
|
a82c9e7c23
vulkan: fix assertion when qy_needs_dequant (#12068)
|
11 months ago |
Judd
|
c132239bfb
add OP sigmoid (#12056)
|
11 months ago |
Rémy O
|
61d4f39dfe
vulkan: implement more backpropagation operators (#11914)
|
11 months ago |
Rémy O
|
2eea03d86a
vulkan: implement several ops relevant for ggml_opt (#11769)
|
11 months ago |
Jeff Bolz
|
bf42a23d0a
vulkan: support multi/vision rope, and noncontiguous rope (#11902)
|
11 months ago |
Rémy O
|
fc1b0d0936
vulkan: initial support for IQ1_S and IQ1_M quantizations (#11528)
|
11 months ago |
Eve
|
a4f011e8d0
vulkan: linux builds + small subgroup size fixes (#11767)
|
11 months ago |
Danny Milosavljevic
|
c2a67efe38
vulkan: Make Vulkan optional at runtime (#11493). (#11494)
|
11 months ago |
Wagner Bruna
|
b044a0fe3c
vulkan: add environment variable GGML_VK_PREFER_HOST_MEMORY to avoid VRAM allocation (#11592)
|
11 months ago |
Jeff Bolz
|
98f6b0fd1e
vulkan: account for lookup tables when checking shared memory size (#11502)
|
11 months ago |
Jeff Bolz
|
c026ba3c23
vulkan: print shared memory size (#11719)
|
11 months ago |
Rémy O
|
8a7e3bf17a
vulkan: initial support for IQ4_XS quantization (#11501)
|
11 months ago |
Jeff Bolz
|
1b598b3058
vulkan: use smaller combined allocations to avoid fragmentation (#11551)
|
11 months ago |
Johannes Gäßler
|
fd08255d0d
CUDA: non-contiguous (RMS) norm support (#11659)
|
11 months ago |
Rémy Oudompheng
|
66ee4f297c
vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)
|
11 months ago |