Jonathan Graehl
|
5cdb27e091
finetune: SGD optimizer, more CLI args (#13873)
|
5 months ago |
Oliver Simons
|
6028bf7435
CUDA: Optimize `reduce_rows_f32` kernel, leading up to 25x perf improvement on kernel-level and 10% perf increase for Gemma3n (#15132)
|
5 months ago |
Georgi Gerganov
|
fd1234cb46
llama : add gpt-oss (#15091)
|
5 months ago |
Jeff Bolz
|
ec0b18802c
vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (#15015)
|
5 months ago |
Sigbjørn Skjæret
|
138b288b59
cuda : add softcap fusion (#14907)
|
5 months ago |
Leonard Mosescu
|
bda62193b2
test-backend-ops : extend test case filtering (#14865)
|
5 months ago |
Erik Scholz
|
89d1029559
vulkan : add fp16 support for the conv_2d kernel (#14872)
|
5 months ago |
Aman Gupta
|
446595b9b3
Docs: add instructions for adding backends (#14889)
|
5 months ago |
Georgi Gerganov
|
18f3b5ff9e
tests : add non-cont K,V FA tests
|
6 months ago |
Aman Gupta
|
8c988fa41d
CUDA: add fused rms norm (#14800)
|
5 months ago |
Jeff Bolz
|
c2e058f1b4
vulkan/cuda: Fix im2col when KW!=KH (#14789)
|
6 months ago |
Ervin Áron Tasnádi
|
a979ca22db
ggml: adds CONV_2D op and direct GEMM Vulkan implementation (#14316)
|
6 months ago |
Georgi Gerganov
|
bf9087f59a
metal : fuse add, mul + add tests (#14596)
|
6 months ago |
Georgi Gerganov
|
225e7a1438
llama : add high-throughput mode (#14363)
|
6 months ago |
Tarek Dakhran
|
c31e60647d
tests : cover lfm2 cases in test_ssm_conv (#14651)
|
6 months ago |
Acly
|
3e303b1107
vulkan : implement ggml_roll (ggml/1290)
|
6 months ago |
Aman Gupta
|
11ee0fea2a
Docs: script to auto-generate ggml operations docs (#14598)
|
6 months ago |
compilade
|
a57d1bcb3c
cuda : support Falcon-H1 state size for SSM_SCAN (#14602)
|
6 months ago |
Xuan-Son Nguyen
|
98bab638fb
ggml : add ggml_scale_bias (#14417)
|
6 months ago |
Georgi Gerganov
|
4d0dcd4a06
cuda : fix rope with partial rotation and non-cont src (#14580)
|
6 months ago |
Jeff Bolz
|
e592be1575
vulkan: fix rms_norm+mul fusion (#14545)
|
6 months ago |
R0CKSTAR
|
b81510a7b7
test-backend-ops: add support for specifying output format (#14368)
|
6 months ago |
Johannes Gäßler
|
c8c4495b8d
ggml: backward pass for split swiglu (#14483)
|
6 months ago |
Georgi Gerganov
|
9067487c44
ggml : fix FA mask dim 2 and 3 (#14505)
|
6 months ago |
Aman Gupta
|
55c2646b45
CUDA: add dynamic shared mem to softmax, refactor general usage (#14497)
|
6 months ago |
compilade
|
5d46babdc2
llama : initial Mamba-2 support (#9126)
|
6 months ago |
Georgi Gerganov
|
ec68e84c32
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (#14435)
|
6 months ago |
Jeff Bolz
|
6a746cf9c4
vulkan: Split large mul_mat_id to fit in shared memory (#14451)
|
6 months ago |
Acly
|
431b2c24f3
ggml-cpu : "align corners" for bilinear upscale/downscale (ggml/1285)
|
6 months ago |
Diego Devesa
|
eb3fa2913e
test-backend-ops : disable llama test (#14461)
|
6 months ago |