Daniel Bevenius
|
a80ff183ab
ggml-cpu : fix leftover handling in ggml_vec_scale_f32 for SVE (#16443)
|
3 months ago |
Yuannan
|
1d49ca3759
nix : removed metal for nix (#16118)
|
3 months ago |
Oleksandr Kuvshynov
|
c5fef0fcea
server: update readme to mention n_past_max metric (#16436)
|
3 months ago |
Gabe Goodhart
|
ca71fb9b36
model : Granite docling + Idefics3 preprocessing (SmolVLM) (#16206)
|
3 months ago |
Reese Levine
|
35266573b9
ggml webgpu: actually add softmax, fix rms_norm offset (#16400)
|
3 months ago |
Eve
|
86df2c9ae4
vulkan: use a more appropriate amount of threads when generating shaders (#16418)
|
3 months ago |
Radoslav Gerganov
|
f39283960b
rpc : check src buffer when copying tensor (#16421)
|
3 months ago |
Radoslav Gerganov
|
898acba681
rpc : add support for multiple devices (#16276)
|
3 months ago |
Acly
|
e29acf74fe
vulkan : incremental shader builds (#16341)
|
3 months ago |
Pascal
|
128d522c04
chat : support Magistral thinking (#16413)
|
3 months ago |
ddh0
|
f6dcda3900
server : context checkpointing for hybrid and recurrent models (#16382)
|
3 months ago |
Georgi Gerganov
|
606a73f531
metal : fix loop bound in ggml_mem_ranges (#16412)
|
3 months ago |
Sigbjørn Skjæret
|
946f71ed9a
llama : fix shapes for bert/mpt q/k norm (#16409)
|
3 months ago |
Acly
|
638d330246
ggml : fix graph reallocation with multiple chunks (#16396)
|
3 months ago |
Aleksander Grygier
|
84c8e305e8
Fix missing messages on sibling navigation (#16408)
|
3 months ago |
Jeff Bolz
|
2aaf0a2a20
vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE (#16354)
|
3 months ago |
Jeff Bolz
|
0e1f838556
vulkan: Fix FA coopmat1 invalid array indexing (#16365)
|
3 months ago |
Daniel Bevenius
|
ad126479c2
ci : change macos-13 to macos-15-intel (#16401)
|
3 months ago |
Aleksander Grygier
|
77233277c9
Capture model name only after first token (streaming) or completed request (#16405)
|
3 months ago |
Jeff Bolz
|
e308efda8e
vulkan: in flash attention, bounds check against nem1 (don't rely on GGML_KQ_MASK_PAD) (#16316)
|
3 months ago |
Aleksander Grygier
|
136bda78c5
webui : Fix messages payload sent to chat completions (#16402)
|
3 months ago |
Pascal
|
5113efd34c
fix: track viewportHeight via window.innerHeight to avoid unwanted scrolling (#16356)
|
3 months ago |
Sigbjørn Skjæret
|
d64c8104f0
test-barrier : do not use more threads than physically available (#16389)
|
3 months ago |
Reese Levine
|
ef07a40906
ggml webgpu: add support for soft_max, optimize rms_norm (#16357)
|
3 months ago |
Piotr Wilkin (ilintar)
|
34fcc5a4ac
model : Apertus model implementation (#15852)
|
3 months ago |
R0CKSTAR
|
91a2a56556
musa: update compile flags (#16265)
|
3 months ago |
Sigbjørn Skjæret
|
72ee736c44
ci : fix ubuntu-latest-cmake-rpc (disable ccache) (#16388)
|
3 months ago |
Eve
|
f09aefaa84
ci: update vulkan ci (#16294)
|
3 months ago |
Georgi Gerganov
|
bbd32bc038
ci : fix clean-up of old logs (#16381)
|
3 months ago |
Neo Zhang Jianyu
|
2be72c2b12
SYCL: Update to oneAPI 2025.2 (#16371)
|
3 months ago |