Commit History

Author SHA1 Message Date
  Pascal e7c2cf1356 server: add router multi-model tests (#17704) (#17722) 1 month ago
  Xuan-Son Nguyen 13628d8bdb server: add --media-path for local media files (#17697) 1 month ago
  Xuan-Son Nguyen ec18edfcba server: introduce API for serving / loading / unloading multiple models (#17470) 1 month ago
  Xuan-Son Nguyen e509411cf1 server: enable jinja by default, update docs (#17524) 1 month ago
  Georgi Gerganov cd5e3b5754 server : support unified cache across slots (#16736) 2 months ago
  Radoslav Gerganov 68ee98ae18 server : return HTTP 400 if prompt exceeds context length (#16486) 3 months ago
  Daniel Bevenius d0991da39d server : add support for external server for tests (#16243) 3 months ago
  Xuan-Son Nguyen 3c3635d2f2 server : speed up tests (#15836) 4 months ago
  Georgi Gerganov 0d161f021a server : enable /slots by default and make it secure (#15630) 4 months ago
  Johannes Gäßler e81b8e4b7f llama: use FA + max. GPU layers by default (#15434) 4 months ago
  Johannes Gäßler fbef0fad7a server: higher timeout for tests (#15621) 4 months ago
  teo 1bc664a26a server: fix OpenAI API compatibility for usage statistics in chat streams (#15444) 4 months ago
  Georgi Gerganov d2fcd91cf9 server : disable context shift by default (#15416) 5 months ago
  Olivier Chafik c9bbc77931 `server`: update deepseek reasoning format (pass reasoning_content as diffs) (#13933) 7 months ago
  Olivier Chafik d74e94c1b3 `server`: fix format of streamed tool call deltas (diff name, fix id location) (#13800) 7 months ago
  Olivier Chafik e121edc432 `server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771) 7 months ago
  Olivier Chafik f5cd27b71d `server`: streaming of tool calls and thoughts when `--jinja` is on (#12379) 7 months ago
  Xuan-Son Nguyen 33eff40240 server : vision support via libmtmd (#12898) 8 months ago
  Diego Devesa 1d36b3670b llama : move end-user examples to tools directory (#13249) 8 months ago