R0CKSTAR
|
492d7f1ff7
musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci and update doc (#12611)
|
10 months ago |
Johannes Gäßler
|
b9ab0a4d0b
CUDA: use arch list for compatibility check (#11775)
|
11 months ago |
Andreas Kieslinger
|
750cb3e246
CUDA: rename macros to avoid conflicts with WinAPI (#10736)
|
1 year ago |
Djip007
|
19d8762ab6
ggml : refactor online repacking (#10446)
|
1 year ago |
Shupei Fan
|
c202cef168
ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541)
|
1 year ago |
compilade
|
9bc6db28d0
ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151)
|
1 year ago |
R0CKSTAR
|
e54c35e4fb
feat: Support Moore Threads GPU (#8383)
|
1 year ago |
Dibakar Gope
|
0f1a39f343
ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (#5780)
|
1 year ago |
Johannes Gäßler
|
cb5fad4c6c
CUDA: refactor and optimize IQ MMVQ (#8215)
|
1 year ago |
Georgi Gerganov
|
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
|
1 year ago |