Xuan-Son Nguyen
|
35370ba945
server : use std::move whenever possible (#12936)
|
9 months ago |
Georgi Gerganov
|
c94085df28
server : add VSCode's Github Copilot Chat support (#12896)
|
9 months ago |
Xuan-Son Nguyen
|
78a1ba0a4f
server : fix thread.join() on exit (#12831)
|
9 months ago |
Xuan-Son Nguyen
|
42eb248f46
common : remove json.hpp from common.cpp (#12697)
|
9 months ago |
Xuan-Son Nguyen
|
267c1399f1
common : refactor downloading system, handle mmproj with -hf option (#12694)
|
9 months ago |
Benson Wong
|
5d01670266
server : include speculative decoding stats when timings_per_token is enabled (#12603)
|
9 months ago |
Piotr
|
2099a9d5db
server : Support listening on a unix socket (#12613)
|
9 months ago |
Marius Gerdes
|
77f9c6bbe5
server : Add verbose output to OAI compatible chat endpoint. (#12246)
|
9 months ago |
Georgi Gerganov
|
810e0af3f5
server : fix warmup draft cache type (#12446)
|
10 months ago |
Georgi Gerganov
|
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
|
10 months ago |
Ishaan Gandhi
|
2048b5913d
server : fix crash when using verbose output with input tokens that are not in printable range (#12178) (#12338)
|
10 months ago |
Olivier Chafik
|
be421fc429
`tool-call`: ensure there's always a non-empty tool call id (#12292)
|
10 months ago |
Olivier Chafik
|
2b3a25c212
`sampler`: fixes trigger tokens + lazy grammars (fix typo cast from token to string) (#12291)
|
10 months ago |
Georgi Gerganov
|
7ab364390f
server : infill gen ends on new line (#12254)
|
10 months ago |
Sigbjørn Skjæret
|
8fad3c7a7c
server : Log original chat template parsing error (#12233)
|
10 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |
Clauszy
|
06a92a193a
server : fix cache reuse logic (#12161)
|
10 months ago |
Georgi Gerganov
|
abd4d0bc4f
speculative : update default params (#11954)
|
11 months ago |
Olivier Chafik
|
63e489c025
tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)
|
11 months ago |
Xuan-Son Nguyen
|
63ac128563
server : add TEI API format for /rerank endpoint (#11942)
|
11 months ago |
Antoine Viallon
|
c4d29baf32
server : fix divide-by-zero in metrics reporting (#11915)
|
11 months ago |
Georgi Gerganov
|
68ff663a04
repo : update links to new url (#11886)
|
11 months ago |
Olivier Chafik
|
c7f460ab88
`server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content (Command 7RB & DeepSeek R1) unless `--reasoning-format none` (#11607)
|
11 months ago |
Oleksandr Kuvshynov
|
e4376270d9
llama.cpp: fix warning message (#11839)
|
11 months ago |
Daniel Bevenius
|
a18f481f99
server : use common_token_to_piece instead of common_detokenize (#11740)
|
11 months ago |
Xuan-Son Nguyen
|
0893e0114e
server : correct signal handler (#11795)
|
11 months ago |
Xuan-Son Nguyen
|
55ac8c7791
server : (webui) revamp Settings dialog, add Pyodide interpreter (#11759)
|
11 months ago |
Georgi Gerganov
|
aaa5505307
server : minor log updates (#11760)
|
11 months ago |
Xuan-Son Nguyen
|
3962fc1a79
server : add try..catch to places not covered by set_exception_handler (#11620)
|
11 months ago |
Olivier Chafik
|
bfcce4d693
`tool-call`: support Command R7B (+ return tool_plan "thoughts" in API) (#11585)
|
11 months ago |