Commit History

Author SHA1 Message Date
  Georgi Gerganov 190c4838bd chat : reserve memory in compute_diffs and improve naming (#17729) 1 month ago
  Aldehir Rojas 0a8026e768 common : introduce composable PEG parser combinators for chat parsing (#17136) 1 month ago
  hksdpc255 1920345c3b common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932) 1 month ago
  Yuri Khrustalev c053e18a66 chat: Add LFM2 tool handling (#16763) 2 months ago
  Georgi Gerganov d00cbea63c server : host-memory prompt caching (#16391) 3 months ago
  Pascal 128d522c04 chat : support Magistral thinking (#16413) 3 months ago
  Piotr Wilkin (ilintar) 34fcc5a4ac model : Apertus model implementation (#15852) 3 months ago
  Jesse 88021565f0 chat : Deepseek V3.1 reasoning and tool calling support (OpenAI Style) (#15533) 4 months ago
  Gabe Goodhart 5fac79cbc7 Thinking model disabled assistant prefill (#15404) 4 months ago
  Piotr Wilkin (ilintar) b2426e469e chat : nemotron thinking & toolcalling support (#15676) 4 months ago
  Piotr Wilkin (ilintar) 60e5eee31f chat : Seed OSS thinking + tool call support (#15552) 4 months ago
  Diego Devesa f75b830647 chat : include kwargs in template example (#15309) 5 months ago
  Xuan-Son Nguyen 53d0a12658 server : allow specifying reasoning_format in HTTP request (#15238) 5 months ago
  Sachin Desai 3db4da56a5 chat : support Granite model reasoning and tool call (#14864) 5 months ago
  Georgi Gerganov fd1234cb46 llama : add gpt-oss (#15091) 5 months ago
  Sigbjørn Skjæret f324a3b715 chat : only remove double bos/eos if added (#15086) 5 months ago
  matteo caf5681fcb server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196) 6 months ago
  Olivier Chafik c9bbc77931 `server`: update deepseek reasoning format (pass reasoning_content as diffs) (#13933) 7 months ago
  Olivier Chafik 03f582ae8f server: fix streaming crashes (#13786) 7 months ago
  Olivier Chafik e121edc432 `server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771) 7 months ago
  Olivier Chafik f5cd27b71d `server`: streaming of tool calls and thoughts when `--jinja` is on (#12379) 7 months ago
  Olivier Chafik aa48e373f2 `server`: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802) 8 months ago
  Olivier Chafik 4e39a3c332 `server`: extract <think> tags from qwq outputs (#12297) 10 months ago
  Olivier Chafik 63e489c025 tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900) 11 months ago