Gabe Goodhart
|
e1fa9569ba
server : add SSL support (#5926)
|
1 year ago |
Georgi Gerganov
|
2002bc96bf
server : refactor (#5882)
|
1 year ago |
Pierrick Hymbert
|
8ef969afce
server : init http requests thread pool with --parallel if set (#5836)
|
1 year ago |
Georgi Gerganov
|
38d16b1426
server : remove api_like_OAI.py proxy script (#5808)
|
1 year ago |
Pierrick Hymbert
|
5cb02b4a01
server: allow to override threads server pool with --threads-http (#5794)
|
1 year ago |
Pierrick Hymbert
|
8b350356b2
server: docs - refresh and tease a little bit more the http server (#5718)
|
1 year ago |
Pierrick Hymbert
|
930b178026
server: logs - unified format and --log-format option (#5700)
|
1 year ago |
Pierrick Hymbert
|
d52d7819b8
server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708)
|
1 year ago |
Pierrick Hymbert
|
525213d2f5
server: init functional tests (#5566)
|
1 year ago |
Alexey Parfenov
|
c5688c6250
server : clarify some params in the docs (#5640)
|
1 year ago |
Xuan Son Nguyen
|
7c8bcc11dc
Add docs for llama_chat_apply_template (#5645)
|
1 year ago |
Pierrick Hymbert
|
1ecea255eb
server: health: fix race condition on slots data using tasks queue (#5634)
|
1 year ago |
Pierrick Hymbert
|
c0a8c6db37
server : health endpoint configurable failure on no slot (#5594)
|
1 year ago |
Robey Holderith
|
5ee99c32f5
common, server : surface min_keep as its own parameter (#5567)
|
1 year ago |
Pierrick Hymbert
|
c145f8a132
server : slots monitoring endpoint (#5550)
|
1 year ago |
Pierrick Hymbert
|
e75c6279d1
server : enhanced health endpoint (#5548)
|
1 year ago |
Pierrick Hymbert
|
36376abe05
server : --n-predict option document and cap to max value (#5549)
|
1 year ago |
Alexey Parfenov
|
6dcc02d244
server : add "samplers" param to control the samplers order (#5494)
|
1 year ago |
bmwl
|
f486f6e1e5
ggml : add numa options (#5377)
|
1 year ago |
Alexey Parfenov
|
684780141a
server : allow to specify tokens as strings in logit_bias (#5003)
|
1 year ago |
Justin Parker
|
f3e2b4fa3f
server : update `/props` with "total_slots" value (#5373)
|
1 year ago |
Michael Coppola
|
31e7903221
server : add `dynatemp_range` and `dynatemp_exponent` (#5352)
|
1 year ago |
Alexey Parfenov
|
a2d60c9158
server : allow to get default generation settings for completion (#5307)
|
1 year ago |
Wu Jian Ping
|
6685cc41c2
server : improve README (#5209)
|
1 year ago |
Kyle Mistele
|
39baaf55a1
docker : add server-first container images (#5157)
|
2 years ago |
Maximilian Winter
|
ec903c0341
server : add self-extend support (#5104)
|
2 years ago |
Michael Coppola
|
27379455c3
server : support for multiple api keys (#4864)
|
2 years ago |
Behnam M
|
7a9f75c38b
server : update readme to document the new `/health` endpoint (#4866)
|
2 years ago |
Behnam M
|
128de3585b
server : update readme about token probs (#4777)
|
2 years ago |
Zsapi
|
8c58330318
server : add api-key flag to documentation (#4832)
|
2 years ago |