Johannes Gäßler | 8137b4bb2b | CPU/CUDA: fix (GQA) mul mat back, add CUDA support (#11380) | 1 year ago
Johannes Gäßler | 9c8dcefe17 | CUDA: backwards pass for misc. ops, add tests (#11257) | 1 year ago
Johannes Gäßler | 432df2d5f9 | RoPE: fix back, CUDA support for back + noncont. (#11240) | 1 year ago
Molly Sophia | ee7136c6d1 | llama: add support for QRWKV6 model architecture (#11001) | 1 year ago
Johannes Gäßler | 53ff6b9b9f | GGUF: C++ refactor, backend support, misc fixes (#11030) | 1 year ago
Georgi Gerganov | 0bf2d10c55 | tts : add OuteTTS support (#10784) | 1 year ago
Johannes Gäßler | 081b29bd2a | tests: add tests for GGUF (#10830) | 1 year ago
Daniel Bevenius | 3919da8e33 | ggml : add check for grad_accs (ggml/1046) | 1 year ago
HimariO | ba1cb19cdd | llama : add Qwen2VL support + multimodal RoPE (#10361) | 1 year ago
Djip007 | 19d8762ab6 | ggml : refactor online repacking (#10446) | 1 year ago
PAB | c2082d93a8 | ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034) | 1 year ago
Shupei Fan | c202cef168 | ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541) | 1 year ago
Diego Devesa | 5931c1f233 | ggml : add support for dynamic loading of backends (#10469) | 1 year ago
Diego Devesa | a5e47592b6 | cuda : optimize argmax (#10441) | 1 year ago
Johannes Gäßler | 02e4eaf22f | ggml-opt: fix data corruption (ggml/1022) | 1 year ago
Georgi Gerganov | 68fcb4759c | ggml : fix compile warnings (#0) | 1 year ago
Johannes Gäßler | 8a43e940ab | ggml: new optimization interface (ggml/988) | 1 year ago
slaren | 883d206fbd | ggml : fix some build issues | 1 year ago
Diego Devesa | ae8de6d50a | ggml : build backends as libraries (#10256) | 1 year ago
Georgi Gerganov | 841f27abdb | metal : optimize FA kernels (#10171) | 1 year ago
Zhiyuan Li | 3bcd40b3c5 | Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (#10133) | 1 year ago
Georgi Gerganov | 1dc04b2dee | ggml : adjust is_first_call init value (#10193) | 1 year ago
Diego Devesa | a9e8a9a030 | ggml : fix arch check in bf16_to_fp32 (#10164) | 1 year ago
Diego Devesa | 401558b7ba | ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (#10167) | 1 year ago
Diego Devesa | 9f40989351 | ggml : move CPU backend to a separate file (#10144) | 1 year ago
Georgi Gerganov | 1804adb0cf | ggml : remove ggml_scratch (#10121) | 1 year ago
Georgi Gerganov | f221d56220 | ggml : alloc ggml_contexts on the heap (whisper/2525) | 1 year ago
Diego Devesa | c02e5ab2a6 | llama : fix buffer checks for mamba and rwk (#10111) | 1 year ago
Diego Devesa | dea5e86051 | ggml : check tensor name lengths in gguf files (#10100) | 1 year ago
Diego Devesa | b9e02e8184 | ggml : fix memory leaks when loading invalid gguf files (#10094) | 1 year ago