Victor
|
add2a3aa5a
server: fix "--grammar-file" parameter (#12285)
|
10 months ago |
Olivier Chafik
|
be421fc429
`tool-call`: ensure there's always a non-empty tool call id (#12292)
|
10 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |
Olivier Chafik
|
1a24c4621f
`server`: fix deadly typo in response_format.json_schema.schema handling (#12168)
|
10 months ago |
rhjdvsgsgks
|
401af80b54
server: handle echo=false on /v1/completions (#12060)
|
10 months ago |
Olivier Chafik
|
0b52745649
server: support add_generation_prompt query param (#12062)
|
10 months ago |
Georgi Gerganov
|
cf756d6e0a
server : disable Nagle's algorithm (#12020)
|
10 months ago |
Olivier Chafik
|
63e489c025
tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)
|
11 months ago |
Xuan-Son Nguyen
|
63ac128563
server : add TEI API format for /rerank endpoint (#11942)
|
11 months ago |
Georgi Gerganov
|
68ff663a04
repo : update links to new url (#11886)
|
11 months ago |
Olivier Chafik
|
c7f460ab88
`server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content (Command 7RB & DeepSeek R1) unless `--reasoning-format none` (#11607)
|
11 months ago |
Daniel Bevenius
|
5598f475be
server : remove CPPHTTPLIB_NO_EXCEPTIONS define (#11622)
|
11 months ago |
Olivier Chafik
|
bfcce4d693
`tool-call`: support Command R7B (+ return tool_plan "thoughts" in API) (#11585)
|
11 months ago |
Olivier Chafik
|
a83f528688
`tool-call`: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme (#11539)
|
11 months ago |
Olivier Chafik
|
b1bcd309fc
fix stop regression (#11543)
|
11 months ago |
Olivier Chafik
|
4a2b196d03
server : fix --jinja when there's no tools or schema (typo was forcing JSON) (#11531)
|
11 months ago |
Olivier Chafik
|
8b576b6c55
Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
|
11 months ago |
Olivier Chafik
|
6171c9d258
Add Jinja template support (#11016)
|
1 year ago |
Georgi Gerganov
|
afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming (#11110)
|
1 year ago |
Georgi Gerganov
|
727368c60f
llama : use LLAMA_TOKEN_NULL (#11062)
|
1 year ago |
Georgi Gerganov
|
f66f582927
llama : refactor `src/llama.cpp` (#10902)
|
1 year ago |
Xuan Son Nguyen
|
0da5d86026
server : allow using LoRA adapters per-request (#10994)
|
1 year ago |
Xuan Son Nguyen
|
45095a61bf
server : clean up built-in template detection (#11026)
|
1 year ago |
Xuan Son Nguyen
|
5896c65232
server : add OAI compat for /v1/completions (#10974)
|
1 year ago |
Reza Kakhki
|
9ba399dfa7
server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
|
1 year ago |
NeverLucky
|
09fe2e7613
server: allow filtering llama server response fields (#10940)
|
1 year ago |
Xuan Son Nguyen
|
485dc01214
server : add system_fingerprint to chat/completion (#10917)
|
1 year ago |
Xuan Son Nguyen
|
57bb2c40cd
server : fix logprobs, make it OAI-compatible (#10783)
|
1 year ago |
Xuan Son Nguyen
|
46828872c3
server : (embeddings) using same format for "input" and "content" (#10872)
|
1 year ago |
krystiancha
|
05c3a444b8
server : fill usage info in embeddings and rerank responses (#10852)
|
1 year ago |