Radoslav Gerganov | af6f91db47 | ggml : remove ggml_graph_import and ggml_graph_export declarations (ggml/1247) | 7 months ago
Max Krasnyansky | 053b1539c0 | threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995) | 7 months ago
Xuan-Son Nguyen | a8ea03d8ad | ggml : add ggml_repeat_4d (#13824) | 8 months ago
Xuan-Son Nguyen | e16c4731c7 | ggml : fix the order of ggml_unary_op (#13718) | 8 months ago
Xuan-Son Nguyen | cf4cb59e64 | ggml : add ggml_gelu_erf() (#13667) | 8 months ago
Johannes Gäßler | 10d2af0eaa | llama/ggml: add LLM training support (#10544) | 8 months ago
Johannes Gäßler | 2356fb1d53 | CUDA: fix bad asserts for partial offload (#13337) | 8 months ago
Johannes Gäßler | 69699be48a | CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (#13137) | 9 months ago
Acly | c6e8cc28c1 | ggml : Depthwise 2D convolution (ggml/1152) | 9 months ago
Diego Devesa | fe92821ea9 | ggml : add bilinear upscale support (ggml/1185) | 9 months ago
Diego Devesa | 459895c326 | ggml : add more generic custom op, remove deprecated custom ops (ggml/1183) | 9 months ago
Georgi Gerganov | b4ae50810e | metal : improve FA + improve MoE (#12612) | 10 months ago
Molly Sophia | 7dfad387e3 | llama: Add support for RWKV v7 architecture (#12412) | 10 months ago
mgroeber9110 | 5bbe6a9fe9 | ggml : portability fixes for VS 2017 (#12150) | 10 months ago
bandoti | fef0cbeadf | cleanup: fix compile warnings associated with gnu_printf (#11811) | 11 months ago
Johannes Gäßler | 864a0b67a6 | CUDA: use mma PTX instructions for FlashAttention (#11583) | 11 months ago
Johannes Gäßler | 9c8dcefe17 | CUDA: backwards pass for misc. ops, add tests (#11257) | 1 year ago
Johannes Gäßler | 432df2d5f9 | RoPE: fix back, CUDA support for back + noncont. (#11240) | 1 year ago
Molly Sophia | ee7136c6d1 | llama: add support for QRWKV6 model architecture (#11001) | 1 year ago
Johannes Gäßler | 53ff6b9b9f | GGUF: C++ refactor, backend support, misc fixes (#11030) | 1 year ago
Georgi Gerganov | 0bf2d10c55 | tts : add OuteTTS support (#10784) | 1 year ago
HimariO | ba1cb19cdd | llama : add Qwen2VL support + multimodal RoPE (#10361) | 1 year ago
Djip007 | 19d8762ab6 | ggml : refactor online repacking (#10446) | 1 year ago
PAB | c2082d93a8 | ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034) | 1 year ago
Shupei Fan | c202cef168 | ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541) | 1 year ago
Diego Devesa | 5931c1f233 | ggml : add support for dynamic loading of backends (#10469) | 1 year ago
Johannes Gäßler | 8a43e940ab | ggml: new optimization interface (ggml/988) | 1 year ago
Diego Devesa | ae8de6d50a | ggml : build backends as libraries (#10256) | 1 year ago
Georgi Gerganov | 841f27abdb | metal : optimize FA kernels (#10171) | 1 year ago
Zhiyuan Li | 3bcd40b3c5 | Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (#10133) | 1 year ago