cturan/llama.cpp
mirror of https://github.com/cturan/llama.cpp
Tree: 963552903f
Commit History
| Author | SHA1 | Message | Date |
|---|---|---|---|
| Johannes Gäßler | bdcb8f4222 | CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860) | 1 year ago |
| Johannes Gäßler | 1f0dabda8d | CUDA: use tensor cores for MMQ (#7676) | 1 year ago |
| Johannes Gäßler | 42b53d192f | CUDA: revise q8_1 data layout for mul_mat_q (#7824) | 1 year ago |
| Johannes Gäßler | 7d1a378b8f | CUDA: refactor mmq, dmmv, mmvq (#7716) | 1 year ago |
| slaren | ae1f211ce2 | cuda : refactor into multiple files (#6269) | 1 year ago |