Sigbjørn Skjæret
|
f324a3b715
chat : only remove double bos/eos if added (#15086)
|
5 months ago |
Jhen-Jie Hong
|
f738989dcb
chat : fix multiple tool_calls on hermes-2-pro (#14962)
|
5 months ago |
kallewoof
|
1a67fcc306
common : avoid logging partial messages (which can contain broken UTF-8 sequences) (#14937)
|
5 months ago |
matteo
|
caf5681fcb
server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196)
|
6 months ago |
Sigbjørn Skjæret
|
e434e69183
common : suggest --jinja when autodetection fails (#14222)
|
7 months ago |
Piotr
|
3cb203c89f
llama-chat : Do not throw when tool parsing fails (#14012)
|
7 months ago |
Olivier Chafik
|
c9bbc77931
`server`: update deepseek reasoning format (pass reasoning_content as diffs) (#13933)
|
7 months ago |
Georgi Gerganov
|
53f925074d
sync : vendor (#13901)
|
7 months ago |
Olivier Chafik
|
03f582ae8f
server: fix streaming crashes (#13786)
|
7 months ago |
Olivier Chafik
|
d74e94c1b3
`server`: fix format of streamed tool call deltas (diff name, fix id location) (#13800)
|
7 months ago |
Olivier Chafik
|
f13847cfb5
server: fix regression on streamed non-chat completion w/ stops (#13785)
|
7 months ago |
Olivier Chafik
|
e121edc432
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
|
7 months ago |
Olivier Chafik
|
f5cd27b71d
`server`: streaming of tool calls and thoughts when `--jinja` is on (#12379)
|
7 months ago |
Olivier Chafik
|
aa48e373f2
`server`: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802)
|
8 months ago |
Xuan-Son Nguyen
|
8c83449cb7
server : (webui) revamp the input area, plus many small UI improvements (#13365)
|
8 months ago |
Olivier Chafik
|
b6930ebc42
`tool-call`: fix non-tool-calling grammar crashes w/ Qwen / Hermes 2 templates (#12900)
|
9 months ago |
Olivier Chafik
|
4e39a3c332
`server`: extract <think> tags from qwq outputs (#12297)
|
10 months ago |
Olivier Chafik
|
87c2630546
allow missing content in message if tool_calls provided (#12293)
|
10 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |
Olivier Chafik
|
63e489c025
tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)
|
11 months ago |
Olivier Chafik
|
f355229692
server: fix type promotion typo causing crashes w/ --jinja w/o tools (#11880)
|
11 months ago |
Olivier Chafik
|
c7f460ab88
`server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content (Command 7RB & DeepSeek R1) unless `--reasoning-format none` (#11607)
|
11 months ago |
Olivier Chafik
|
9f4cc8f8d3
`sync`: minja (#11641)
|
11 months ago |
Olivier Chafik
|
db288b60cb
`tool-call`: command r7b fix for normal responses (#11608)
|
11 months ago |
Olivier Chafik
|
bfcce4d693
`tool-call`: support Command R7B (+ return tool_plan "thoughts" in API) (#11585)
|
11 months ago |
Olivier Chafik
|
a83f528688
`tool-call`: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme (#11539)
|
11 months ago |
Olivier Chafik
|
8b576b6c55
Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
|
11 months ago |