Isaac McFadyen
|
e0539eb6ae
webui: switch to hash-based routing (alternative of #16079) (#16157)
|
3 months ago |
Douglas Hanley
|
b5bd037832
llama : add support for qwen3 reranker (#15824)
|
3 months ago |
Benni
|
459c0c2c1a
server: fix SSE and OpenAI compatibility for error messages when streaming (#16109)
|
3 months ago |
Radoslav Gerganov
|
2b6b55a59f
server : include usage statistics only when user request them (#16052)
|
4 months ago |
Aleksander Grygier
|
a7a98e0fff
SvelteKit-based WebUI (#14839)
|
4 months ago |
Sigbjørn Skjæret
|
6c019cb04e
server : only attempt to enable thinking if using jinja (#15967)
|
4 months ago |
Georgi Gerganov
|
f088b6a84f
server : adjust prompt similarity thold + add logs (#15913)
|
4 months ago |
Xuan-Son Nguyen
|
56920f5665
server : bring back timings_per_token (#15879)
|
4 months ago |
Xuan-Son Nguyen
|
61bdfd5298
server : implement prompt processing progress report in stream mode (#15827)
|
4 months ago |
Gabe Goodhart
|
fd621880f3
aLoRA Support (#15327)
|
4 months ago |
Gabe Goodhart
|
5fac79cbc7
Thinking model disabled assistant prefill (#15404)
|
4 months ago |
Xuan-Son Nguyen
|
a68d914426
server: add exceed_context_size_error type (#15780)
|
4 months ago |
Georgi Gerganov
|
e92d53b29e
sampling : optimize samplers by reusing bucket sort (#15665)
|
4 months ago |
Georgi Gerganov
|
0d161f021a
server : enable /slots by default and make it secure (#15630)
|
4 months ago |
Sigbjørn Skjæret
|
84ab83cc0b
model : jina-embeddings-v3 support (#13693)
|
4 months ago |
65a
|
4afb0a746f
server : Support multimodal completion and embeddings prompts in JSON format (#15108)
|
4 months ago |
teo
|
1bc664a26a
server: fix OpenAI API compatibility for usage statistics in chat streams (#15444)
|
4 months ago |
davidef
|
d1d8241600
server : fix incoming tasks not process in order (#15395)
|
5 months ago |
Oleksandr Kuvshynov
|
e5155e6986
server : export max observed n_past value (#15361)
|
5 months ago |
Diego Devesa
|
f75b830647
chat : include kwargs in template example (#15309)
|
5 months ago |
Georgi Gerganov
|
d32e03f449
server : add SWA checkpoints (#15293)
|
5 months ago |
Sigbjørn Skjæret
|
b3e16665e1
server : enable -td and -tbd parameters (#15172)
|
5 months ago |
Copilot
|
d8914fc47e
common : add --override-tensor-draft, --cpu-moe-draft and --n-cpu-moe-draft parameters (#15191)
|
5 months ago |
Xuan-Son Nguyen
|
53d0a12658
server : allow specifying reasoning_format in HTTP request (#15238)
|
5 months ago |
Johannes Gäßler
|
f906275537
server: enable token array inputs for OAI API (#15001)
|
5 months ago |
g2mt
|
94933c8c2e
server : implement universal assisted decoding (#12635)
|
5 months ago |
Lukas Straub
|
a9f77a8be3
server : add openai-style logit_bias support (#14946)
|
5 months ago |
Daniel Bevenius
|
41e78c567e
server : add support for `embd_normalize` parameter (#14964)
|
5 months ago |
Molly Sophia
|
adef81781a
server : allow setting `--reverse-prompt` arg (#14799)
|
5 months ago |
IsaacDynamo
|
b4efd77f8a
server : add parse_special option to /tokenize endpoint (#14783)
|
6 months ago |