Georgi Gerganov
|
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
|
10 months ago |
Ishaan Gandhi
|
2048b5913d
server : fix crash when using verbose output with input tokens that are not in printable range (#12178) (#12338)
|
10 months ago |
Olivier Chafik
|
be421fc429
`tool-call`: ensure there's always a non-empty tool call id (#12292)
|
10 months ago |
Olivier Chafik
|
2b3a25c212
`sampler`: fixes trigger tokens + lazy grammars (fix typo cast from token to string) (#12291)
|
10 months ago |
Georgi Gerganov
|
7ab364390f
server : infill gen ends on new line (#12254)
|
10 months ago |
Sigbjørn Skjæret
|
8fad3c7a7c
server : Log original chat template parsing error (#12233)
|
10 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |
Clauszy
|
06a92a193a
server : fix cache reuse logic (#12161)
|
10 months ago |
Georgi Gerganov
|
abd4d0bc4f
speculative : update default params (#11954)
|
11 months ago |
Olivier Chafik
|
63e489c025
tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)
|
11 months ago |
Xuan-Son Nguyen
|
63ac128563
server : add TEI API format for /rerank endpoint (#11942)
|
11 months ago |
Antoine Viallon
|
c4d29baf32
server : fix divide-by-zero in metrics reporting (#11915)
|
11 months ago |
Georgi Gerganov
|
68ff663a04
repo : update links to new url (#11886)
|
11 months ago |
Olivier Chafik
|
c7f460ab88
`server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content (Command 7RB & DeepSeek R1) unless `--reasoning-format none` (#11607)
|
11 months ago |
Oleksandr Kuvshynov
|
e4376270d9
llama.cpp: fix warning message (#11839)
|
11 months ago |
Daniel Bevenius
|
a18f481f99
server : use common_token_to_piece instead of common_detokenize (#11740)
|
11 months ago |
Xuan-Son Nguyen
|
0893e0114e
server : correct signal handler (#11795)
|
11 months ago |
Xuan-Son Nguyen
|
55ac8c7791
server : (webui) revamp Settings dialog, add Pyodide interpreter (#11759)
|
11 months ago |
Georgi Gerganov
|
aaa5505307
server : minor log updates (#11760)
|
11 months ago |
Xuan-Son Nguyen
|
3962fc1a79
server : add try..catch to places not covered by set_exception_handler (#11620)
|
11 months ago |
Olivier Chafik
|
bfcce4d693
`tool-call`: support Command R7B (+ return tool_plan "thoughts" in API) (#11585)
|
11 months ago |
Olivier Chafik
|
a83f528688
`tool-call`: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme (#11539)
|
11 months ago |
Olivier Chafik
|
5783575c9d
Fix chatml fallback for unsupported builtin templates (when --jinja not enabled) (#11533)
|
11 months ago |
Daniel Bevenius
|
a2df2787b3
server : update help metrics processing/deferred (#11512)
|
11 months ago |
Olivier Chafik
|
8b576b6c55
Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
|
11 months ago |
Daniel Bevenius
|
4314e56c4f
server : use lambda instead of std::bind (#11507)
|
11 months ago |
Nigel Bosch
|
eb7cf15a80
server : add /apply-template endpoint for additional use cases of Minja functionality (#11489)
|
11 months ago |
Daniel Bevenius
|
e51c47b401
server : update auto gen files comments [no ci] (#11484)
|
11 months ago |
Xuan Son Nguyen
|
49b0e3cec4
server : fix cleaning up stream task (#11418)
|
11 months ago |
Xuan Son Nguyen
|
5845661640
server : add more clean up when cancel_tasks is called (#11340)
|
11 months ago |