Xuan-Son Nguyen
|
53d0a12658
server : allow specifying reasoning_format in HTTP request (#15238)
|
5 months ago |
Lukas Straub
|
a9f77a8be3
server : add openai-style logit_bias support (#14946)
|
5 months ago |
Daniel Bevenius
|
41e78c567e
server : add support for `embd_normalize` parameter (#14964)
|
5 months ago |
IsaacDynamo
|
b4efd77f8a
server : add parse_special option to /tokenize endpoint (#14783)
|
5 months ago |
Johannes Gäßler
|
5cae766541
scripts: synthetic prompt mode for server-bench.py (#14695)
|
6 months ago |
matteo
|
caf5681fcb
server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196)
|
6 months ago |
Nigel Bosch
|
1b809cee22
server : move no API key doc to /health (#14352)
|
6 months ago |
aa956
|
d67341dc18
server : add server parameters for draft model cache type (#13782)
|
7 months ago |
Olivier Chafik
|
e121edc432
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
|
7 months ago |
Isaac McFadyen
|
6a2bc8bfb7
server : added --no-prefill-assistant flag (#13608)
|
8 months ago |
Georgi Gerganov
|
053174436f
server : passthrough the /models endpoint during loading (#13535)
|
8 months ago |
Xuan-Son Nguyen
|
3b24d26c22
server : update docs (#13432)
|
8 months ago |
Xuan-Son Nguyen
|
33eff40240
server : vision support via libmtmd (#12898)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |