matteo
|
caf5681fcb
server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196)
|
6 months ago |
Nigel Bosch
|
1b809cee22
server : move no API key doc to /health (#14352)
|
6 months ago |
aa956
|
d67341dc18
server : add server parameters for draft model cache type (#13782)
|
7 months ago |
Olivier Chafik
|
e121edc432
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
|
7 months ago |
Isaac McFadyen
|
6a2bc8bfb7
server : added --no-prefill-assistant flag (#13608)
|
8 months ago |
Georgi Gerganov
|
053174436f
server : passthrough the /models endpoint during loading (#13535)
|
8 months ago |
Xuan-Son Nguyen
|
3b24d26c22
server : update docs (#13432)
|
8 months ago |
Xuan-Son Nguyen
|
33eff40240
server : vision support via libmtmd (#12898)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |