Molly Sophia
|
7dfad387e3
llama: Add support for RWKV v7 architecture (#12412)
|
10 months ago |
Jeff Bolz
|
484a8ab513
vulkan: Add N/2 and N/4 optimized paths in coopmat2 shader (#12312)
|
10 months ago |
Daniele
|
cf2270e4d3
vulkan: subgroup size tuning (#12087)
|
10 months ago |
Jeff Bolz
|
891c63956d
vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (#12273)
|
10 months ago |
Jeff Bolz
|
2f21123c1d
vulkan: Adjust coopmat2 tile sizes and selection heuristic (#12258)
|
10 months ago |
cmdr2
|
0cbee131ad
cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)
|
11 months ago |
William Tambellini
|
70680c48e5
ggml : upgrade init_tensor API to return a ggml_status (#11854)
|
11 months ago |
Rémy O
|
438a83926a
vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)
|
11 months ago |
Jeff Bolz
|
a82c9e7c23
vulkan: fix assertion when qy_needs_dequant (#12068)
|
11 months ago |
Judd
|
c132239bfb
add OP sigmoid (#12056)
|
11 months ago |
Rémy O
|
61d4f39dfe
vulkan: implement more backpropagation operators (#11914)
|
11 months ago |
Rémy O
|
2eea03d86a
vulkan: implement several ops relevant for ggml_opt (#11769)
|
11 months ago |
Jeff Bolz
|
bf42a23d0a
vulkan: support multi/vision rope, and noncontiguous rope (#11902)
|
11 months ago |
Rémy O
|
fc1b0d0936
vulkan: initial support for IQ1_S and IQ1_M quantizations (#11528)
|
11 months ago |
Eve
|
a4f011e8d0
vulkan: linux builds + small subgroup size fixes (#11767)
|
11 months ago |
Danny Milosavljevic
|
c2a67efe38
vulkan: Make Vulkan optional at runtime (#11493). (#11494)
|
11 months ago |
Wagner Bruna
|
b044a0fe3c
vulkan: add environment variable GGML_VK_PREFER_HOST_MEMORY to avoid VRAM allocation (#11592)
|
11 months ago |
Jeff Bolz
|
98f6b0fd1e
vulkan: account for lookup tables when checking shared memory size (#11502)
|
1 year ago |
Jeff Bolz
|
c026ba3c23
vulkan: print shared memory size (#11719)
|
1 year ago |
Rémy O
|
8a7e3bf17a
vulkan: initial support for IQ4_XS quantization (#11501)
|
1 year ago |
Jeff Bolz
|
1b598b3058
vulkan: use smaller combined allocations to avoid fragmentation (#11551)
|
1 year ago |
Johannes Gäßler
|
fd08255d0d
CUDA: non-contiguous (RMS) norm support (#11659)
|
1 year ago |
Rémy Oudompheng
|
66ee4f297c
vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)
|
1 year ago |
Jeff Bolz
|
2711d0215f
vulkan: Catch pipeline creation failure and print an error message (#11436)
|
1 year ago |
Jeff Bolz
|
4a75d19376
vulkan: compile shaders on-demand (#11406)
|
1 year ago |
amd-dwang
|
955a6c2d91
Vulkan-run-test: fix mmq_wg_denoms (#11343)
|
1 year ago |
Jeff Bolz
|
5245729e33
vulkan: fix diag_mask_inf (#11323)
|
1 year ago |
Jeff Bolz
|
aea8ddd516
vulkan: fix coopmat2 validation failures (#11284)
|
1 year ago |
Jeff Bolz
|
44e18ef939
vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281)
|
1 year ago |
Jeff Bolz
|
bd38ddea01
vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166)
|
1 year ago |