Eve
|
cb999704fb
vulkan: small dequantization improvements (#18380)
|
1 month ago |
Jeff Bolz
|
b96b82fc85
vulkan: Support UPSCALE w/antialias (#18327)
|
1 month ago |
Jeff Bolz
|
10dc500bdb
vulkan: handle rope with large number of rows (#18306)
|
1 month ago |
o7si
|
4893cc07bb
server : fix crash when seq_rm fails for hybrid/recurrent models (#18391)
|
1 month ago |
Francisco Herrera
|
af3be131c0
docs: added note for pre SYCL Intel hardware (#18016)
|
1 month ago |
0Marble
|
b07cda687c
CANN: implement the SSM_CONV operator (#17737)
|
1 month ago |
Aman Gupta
|
85c40c9b02
ggml-cuda: fix regex for arch list (#18371)
|
1 month ago |
Aman Gupta
|
83b3b1c271
cuda: optimize cumsum cub path (#18362)
|
1 month ago |
Aman Gupta
|
b0fb0f0aee
ggml-cuda: fix blackwell native builds (#18361)
|
1 month ago |
Penglin Cai
|
e68c19b0fd
CANN: Add support for CONV_TRANSPOSE_1D when kernel size > 255 (#17934)
|
1 month ago |
Aadeshveer Singh
|
c54bba869d
ggml : optimize cuda cumsum fallback kernel (#18343)
|
1 month ago |
Xuan-Son Nguyen
|
f5acfb2ffa
server: (router) add stop-timeout option (#18350)
|
1 month ago |
Xuan-Son Nguyen
|
4cbafad4f0
model: support MiMo-V2-Flash (#18328)
|
1 month ago |
Aadeshveer Singh
|
c184284230
fit-params : fix race condition in fit-params output (#18276)
|
1 month ago |
Aman Gupta
|
c8a2417d7b
CUDA: experimental native mxfp4 support for blackwell (#17906)
|
1 month ago |
Saba Fallah
|
54132f1b1f
model : support for LlamaBidirectionalModel architecture (#18220)
|
1 month ago |
Jeff Bolz
|
2a9ea2020c
vulkan: fix command buffer corruption in ggml_backend_vk_event_wait (#18302)
|
1 month ago |
Wang Weixuan
|
ce7a6dc0fc
CANN : refactor ACL graph cache (#17752)
|
1 month ago |
Jesse Ikonen
|
1ce0126b18
docs: Fix typos in SYCL documentation (#18269)
|
1 month ago |
Ruben Ortlam
|
7f459c98e7
vulkan: use fewer FA rows for small cache runs (#18280)
|
1 month ago |
TianHao324
|
cf2ffc02bc
CANN: Uses yarn_ramp cache in ROPE (#17725)
|
1 month ago |
ddh0
|
10355dc7d0
common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for `-ot` arg (#18267)
|
1 month ago |
Xuan-Son Nguyen
|
5ee4e43f26
server: return_progress to also report 0% processing state (#18305)
|
1 month ago |
Pascal
|
5b6c9bc0f3
webui: apply webui_settings on first load (#18223)
|
1 month ago |
Xuan-Son Nguyen
|
849d021104
server: fix crash with model not having BOS/EOS (#18321)
|
1 month ago |
Daniel Bevenius
|
8e3ead6e4d
model-conversion : add device option to run-org-model.py (#18318)
|
1 month ago |
Chris Rohlf
|
12ee1763a6
rpc : add check for rpc buffer type (#18242)
|
1 month ago |
nullname
|
ed75977717
ggml-hexagon: create generalized functions for cpu side op (#17500)
|
1 month ago |
Daniel Bevenius
|
847c35f7d5
model-conversion : add trust_remote_code for embedding scripts (#18288)
|
1 month ago |
Neo Zhang
|
a6a552e4ec
[SYCL] replace llama-cli by llama-completion to rm the impact to test script (#18290)
|
1 month ago |