Jeff Bolz
|
8e2da778da
vulkan: change memory_logger to be controlled by an env var (#18769)
|
2 weeks ago |
Xuan-Son Nguyen
|
ce3bf9b1a4
server: update docs for sleeping [no ci] (#18777)
|
2 weeks ago |
Jeff Bolz
|
2bbe4c2cf8
vulkan: Use VK_EXT_shader_64bit_indexing to handle large mat_mul(_id) (#18678)
|
2 weeks ago |
Ruben Ortlam
|
1051ecd289
vulkan: Disable large coopmat matmul configuration on proprietary AMD driver (#18763)
|
2 weeks ago |
Xuan-Son Nguyen
|
0c3b7a9efe
model: fix qwen3next broken due to #18683 (#18762)
|
2 weeks ago |
Ruben Ortlam
|
0e76501e1d
Vulkan: Optimize Matmul parameters for AMD GPUs with Coopmat support (#18749)
|
3 weeks ago |
Xuan-Son Nguyen
|
4b060bf240
security: make it clear about subtopics in server (#18754)
|
3 weeks ago |
Daniel Bevenius
|
9789e28459
debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooling check (#18692)
|
3 weeks ago |
Georgi Gerganov
|
84ae04f163
tests : refactor test-backend-sampler (#18753)
|
3 weeks ago |
Xuan-Son Nguyen
|
506bb6e010
model: try to improve Qwen3 Next (#18683)
|
3 weeks ago |
thom-dev-fr
|
79456a690a
readme : update UIs (#18751)
|
3 weeks ago |
Xuan-Son Nguyen
|
28068af789
security: narrow down the scope of what we consider a vulnerability (#18752)
|
3 weeks ago |
shaofeiqi
|
707cbafcaa
opencl: add SOFTPLUS op support (#18726)
|
3 weeks ago |
Aman Gupta
|
b137718878
test-backend-ops: fix mxfp4 tests on blackwell (#18736)
|
3 weeks ago |
Johannes Gäßler
|
d2ff4e23ac
HIP: adjust RDNA3.5 MMQ kernel selction logic (#18666)
|
3 weeks ago |
Perry Naseck
|
657a2e644b
cmake : update blas logic (#18205)
|
3 weeks ago |
Georgi Gerganov
|
f307926482
server : adjust unified KV cache tests (#18716)
|
3 weeks ago |
Sigbjørn Skjæret
|
7fdc8c893d
scripts : follow api redirects in pr2wt.sh (#18739)
|
3 weeks ago |
Xuan-Son Nguyen
|
23f82f2420
preset: allow named remote preset (#18728)
|
3 weeks ago |
Aaron Teo
|
2656c0d265
docs(ggml): update backend ops (#18734)
|
3 weeks ago |
Michael Wand
|
600a366478
Corrected: changed s13 = src1->nb[3] instead of nb[2] (#18724)
|
3 weeks ago |
Adrien Gallouët
|
ea23c15990
common : add --license to display embedded licenses (#18696)
|
3 weeks ago |
Xuan-Son Nguyen
|
9ac2693a30
server: fix n_cmpl not skipping processing prompt (#18663)
|
3 weeks ago |
Simranjeet Singh
|
a61c8bc3bf
mtmd: Add Gemma3n multimodal support with MobileNetV5 vision encoder (#18256)
|
3 weeks ago |
shaofeiqi
|
593da7fa49
opencl: add EXPM1 op (#18704)
|
3 weeks ago |
Reese Levine
|
9e41884dce
Updates to webgpu get_memory (#18707)
|
3 weeks ago |
Pascal
|
ec8fd7876b
Webui/file upload (#18694)
|
3 weeks ago |
Asbjørn Olling
|
a180ba78c7
cmake: only build cli when server is enabled (#18670)
|
3 weeks ago |
Georgi Gerganov
|
53eb9435da
server : fix timing of prompt/generation (#18713)
|
3 weeks ago |
Georgi Gerganov
|
d3435efc8a
scripts : pr2wt.sh reset to remote head (#18695)
|
3 weeks ago |