Johannes Gäßler 69c487f4ed CUDA: MMQ code deduplication + iquant support (#8495) 1 year ago
template-instances 69c487f4ed CUDA: MMQ code deduplication + iquant support (#8495) 1 year ago
acc.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
acc.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
arange.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
arange.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
argsort.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
argsort.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
binbcast.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
binbcast.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
clamp.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
clamp.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
common.cuh b078c619aa cuda : suppress 'noreturn' warn in no_device_code (#8414) 1 year ago
concat.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
concat.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
conv-transpose-1d.cu fde13b3bb9 feat: cuda implementation for `ggml_conv_transpose_1d` (ggml/854) 1 year ago
conv-transpose-1d.cuh fde13b3bb9 feat: cuda implementation for `ggml_conv_transpose_1d` (ggml/854) 1 year ago
convert.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
convert.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
cpy.cu 07a3fc0608 Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (#8258) 1 year ago
cpy.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
dequantize.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
diagmask.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
diagmask.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
dmmv.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
dmmv.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn-common.cuh 8e558309dc CUDA: MMQ support for iq4_nl, iq4_xs (#8278) 1 year ago
fattn-tile-f16.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn-tile-f16.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn-tile-f32.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn-tile-f32.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn-vec-f16.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn-vec-f32.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn-wmma-f16.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
fattn.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
getrows.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
getrows.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
im2col.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
im2col.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
mma.cuh 808aba3916 CUDA: optimize and refactor MMQ (#8416) 1 year ago
mmq.cu 69c487f4ed CUDA: MMQ code deduplication + iquant support (#8495) 1 year ago
mmq.cuh 69c487f4ed CUDA: MMQ code deduplication + iquant support (#8495) 1 year ago
mmvq.cu cb5fad4c6c CUDA: refactor and optimize IQ MMVQ (#8215) 1 year ago
mmvq.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
norm.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
norm.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
pad.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
pad.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
pool2d.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
pool2d.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
quantize.cu 808aba3916 CUDA: optimize and refactor MMQ (#8416) 1 year ago
quantize.cuh 808aba3916 CUDA: optimize and refactor MMQ (#8416) 1 year ago
rope.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
rope.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
scale.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
scale.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
softmax.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
softmax.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
sumrows.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
sumrows.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
tsembd.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
tsembd.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
unary.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
unary.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
upscale.cu f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
upscale.cuh f3f65429c4 llama : reorganize source code + improve CMake (#8006) 1 year ago
vecdotq.cuh 69c487f4ed CUDA: MMQ code deduplication + iquant support (#8495) 1 year ago