Georgi Gerganov
|
554fd578a5
server : fix mtmd checkpoints (#16591)
|
3 months ago |
Yann Follet
|
31d0ff1869
server / ranking : add sorting and management of top_n (#16403)
|
3 months ago |
Georgi Gerganov
|
d00cbea63c
server : host-memory prompt caching (#16391)
|
3 months ago |
Douglas Hanley
|
b5bd037832
llama : add support for qwen3 reranker (#15824)
|
3 months ago |
Benni
|
459c0c2c1a
server: fix SSE and OpenAI compatibility for error messages when streaming (#16109)
|
3 months ago |
Gabe Goodhart
|
fd621880f3
aLoRA Support (#15327)
|
4 months ago |
Gabe Goodhart
|
5fac79cbc7
Thinking model disabled assistant prefill (#15404)
|
4 months ago |
65a
|
4afb0a746f
server : Support multimodal completion and embeddings prompts in JSON format (#15108)
|
4 months ago |
Johannes Gäßler
|
494c5899cb
scripts: benchmark for HTTP server throughput (#14668)
|
6 months ago |
Sigbjørn Skjæret
|
ddef99522d
server : fix assistant prefilling when content is an array (#14360)
|
6 months ago |
matteo
|
caf5681fcb
server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196)
|
6 months ago |
Sigbjørn Skjæret
|
88fc854b4b
llama : improve sep token handling (#14272)
|
7 months ago |
Georgi Gerganov
|
53f925074d
sync : vendor (#13901)
|
7 months ago |
Xuan-Son Nguyen
|
10961339b2
mtmd : move helpers to dedicated library (⚠️ breaking change) (#13866)
|
7 months ago |
Đinh Trọng Huy
|
e0e3aa231d
llama : add support for BertForSequenceClassification reranker (#13858)
|
7 months ago |
Sky
|
c962ae3382
server: fix remove 'image_url'/'input_audio' json-object effectlly for 'llama_params' in multimodal-model-mode (#13853)
|
7 months ago |
Olivier Chafik
|
03f582ae8f
server: fix streaming crashes (#13786)
|
7 months ago |
Olivier Chafik
|
e121edc432
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
|
7 months ago |
Olivier Chafik
|
d785f9c1fd
server: fix/test add_generation_prompt (#13770)
|
7 months ago |
Olivier Chafik
|
f5cd27b71d
`server`: streaming of tool calls and thoughts when `--jinja` is on (#12379)
|
7 months ago |
Xuan-Son Nguyen
|
9ecf3e66a3
server : support audio input (#13714)
|
7 months ago |
Xuan-Son Nguyen
|
797990c4bc
mtmd : add ultravox audio input (#13623)
|
7 months ago |
Isaac McFadyen
|
6a2bc8bfb7
server : added --no-prefill-assistant flag (#13608)
|
8 months ago |
Piotr Wilkin (ilintar)
|
c753d7bed0
server : proper error handling for missing elements in messages array (OpenAI compatible backend) (#13540)
|
8 months ago |
Xuan-Son Nguyen
|
360a9c98e1
server : fix cache_tokens bug with no cache_prompt (#13533)
|
8 months ago |
Anudit Nagar
|
91159ee9df
server : allow content to be null in oaicompat_completion_params_parse (#13477)
|
8 months ago |
Xuan-Son Nguyen
|
33eff40240
server : vision support via libmtmd (#12898)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |