Xuan-Son Nguyen | 0c3b7a9efe | model: fix qwen3next broken due to #18683 (#18762) | 2 weeks ago
Ruben Ortlam | 0e76501e1d | Vulkan: Optimize Matmul parameters for AMD GPUs with Coopmat support (#18749) | 2 weeks ago
Xuan-Son Nguyen | 4b060bf240 | security: make it clear about subtopics in server (#18754) | 2 weeks ago
Daniel Bevenius | 9789e28459 | debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooling check (#18692) | 2 weeks ago
Georgi Gerganov | 84ae04f163 | tests : refactor test-backend-sampler (#18753) | 2 weeks ago
Xuan-Son Nguyen | 506bb6e010 | model: try to improve Qwen3 Next (#18683) | 2 weeks ago
thom-dev-fr | 79456a690a | readme : update UIs (#18751) | 2 weeks ago
Xuan-Son Nguyen | 28068af789 | security: narrow down the scope of what we consider a vulnerability (#18752) | 2 weeks ago
shaofeiqi | 707cbafcaa | opencl: add SOFTPLUS op support (#18726) | 2 weeks ago
Aman Gupta | b137718878 | test-backend-ops: fix mxfp4 tests on blackwell (#18736) | 2 weeks ago
Johannes Gäßler | d2ff4e23ac | HIP: adjust RDNA3.5 MMQ kernel selction logic (#18666) | 2 weeks ago
Perry Naseck | 657a2e644b | cmake : update blas logic (#18205) | 2 weeks ago
Georgi Gerganov | f307926482 | server : adjust unified KV cache tests (#18716) | 2 weeks ago
Sigbjørn Skjæret | 7fdc8c893d | scripts : follow api redirects in pr2wt.sh (#18739) | 2 weeks ago
Xuan-Son Nguyen | 23f82f2420 | preset: allow named remote preset (#18728) | 2 weeks ago
Aaron Teo | 2656c0d265 | docs(ggml): update backend ops (#18734) | 2 weeks ago
Michael Wand | 600a366478 | Corrected: changed s13 = src1->nb[3] instead of nb[2] (#18724) | 2 weeks ago
Adrien Gallouët | ea23c15990 | common : add --license to display embedded licenses (#18696) | 2 weeks ago
Xuan-Son Nguyen | 9ac2693a30 | server: fix n_cmpl not skipping processing prompt (#18663) | 3 weeks ago
Simranjeet Singh | a61c8bc3bf | mtmd: Add Gemma3n multimodal support with MobileNetV5 vision encoder (#18256) | 3 weeks ago
shaofeiqi | 593da7fa49 | opencl: add EXPM1 op (#18704) | 3 weeks ago
Reese Levine | 9e41884dce | Updates to webgpu get_memory (#18707) | 3 weeks ago
Pascal | ec8fd7876b | Webui/file upload (#18694) | 3 weeks ago
Asbjørn Olling | a180ba78c7 | cmake: only build cli when server is enabled (#18670) | 3 weeks ago
Georgi Gerganov | 53eb9435da | server : fix timing of prompt/generation (#18713) | 3 weeks ago
Georgi Gerganov | d3435efc8a | scripts : pr2wt.sh reset to remote head (#18695) | 3 weeks ago
Georgi Gerganov | f5f8812f7c | server : use different seeds for child completions (#18700) | 3 weeks ago
Xuan-Son Nguyen | 8ece3836b4 | common: support remote preset (#18520) | 3 weeks ago
Aaron Teo | 046d5fd44e | llama: use host memory if device reports 0 memory (#18587) | 3 weeks ago
Masashi Yoshimura | 480160d472 | ggml-webgpu: Fix GGML_MEM_ALIGN to 8 for emscripten. (#18628) | 3 weeks ago