Commit History

Author SHA1 Message Date
  Georgi Gerganov 554fd578a5 server : fix mtmd checkpoints (#16591) 3 months ago
  Yann Follet 31d0ff1869 server / ranking : add sorting and management of top_n (#16403) 3 months ago
  Georgi Gerganov d00cbea63c server : host-memory prompt caching (#16391) 3 months ago
  Douglas Hanley b5bd037832 llama : add support for qwen3 reranker (#15824) 3 months ago
  Benni 459c0c2c1a server: fix SSE and OpenAI compatibility for error messages when streaming (#16109) 3 months ago
  Gabe Goodhart fd621880f3 aLoRA Support (#15327) 4 months ago
  Gabe Goodhart 5fac79cbc7 Thinking model disabled assistant prefill (#15404) 4 months ago
  65a 4afb0a746f server : Support multimodal completion and embeddings prompts in JSON format (#15108) 4 months ago
  Johannes Gäßler 494c5899cb scripts: benchmark for HTTP server throughput (#14668) 6 months ago
  Sigbjørn Skjæret ddef99522d server : fix assistant prefilling when content is an array (#14360) 6 months ago
  matteo caf5681fcb server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196) 6 months ago
  Sigbjørn Skjæret 88fc854b4b llama : improve sep token handling (#14272) 7 months ago
  Georgi Gerganov 53f925074d sync : vendor (#13901) 7 months ago
  Xuan-Son Nguyen 10961339b2 mtmd : move helpers to dedicated library (⚠️ breaking change) (#13866) 7 months ago
  Đinh Trọng Huy e0e3aa231d llama : add support for BertForSequenceClassification reranker (#13858) 7 months ago
  Sky c962ae3382 server: fix remove 'image_url'/'input_audio' json-object effectlly for 'llama_params' in multimodal-model-mode (#13853) 7 months ago
  Olivier Chafik 03f582ae8f server: fix streaming crashes (#13786) 7 months ago
  Olivier Chafik e121edc432 `server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771) 7 months ago
  Olivier Chafik d785f9c1fd server: fix/test add_generation_prompt (#13770) 7 months ago
  Olivier Chafik f5cd27b71d `server`: streaming of tool calls and thoughts when `--jinja` is on (#12379) 7 months ago
  Xuan-Son Nguyen 9ecf3e66a3 server : support audio input (#13714) 7 months ago
  Xuan-Son Nguyen 797990c4bc mtmd : add ultravox audio input (#13623) 7 months ago
  Isaac McFadyen 6a2bc8bfb7 server : added --no-prefill-assistant flag (#13608) 8 months ago
  Piotr Wilkin (ilintar) c753d7bed0 server : proper error handling for missing elements in messages array (OpenAI compatible backend) (#13540) 8 months ago
  Xuan-Son Nguyen 360a9c98e1 server : fix cache_tokens bug with no cache_prompt (#13533) 8 months ago
  Anudit Nagar 91159ee9df server : allow content to be null in oaicompat_completion_params_parse (#13477) 8 months ago
  Xuan-Son Nguyen 33eff40240 server : vision support via libmtmd (#12898) 8 months ago
  Diego Devesa 1d36b3670b llama : move end-user examples to tools directory (#13249) 8 months ago