Commit History

Author SHA1 Message Date
  Georgi Gerganov 38d16b1426 server : remove api_like_OAI.py proxy script (#5808) 1 year ago
  Pierrick Hymbert 5cb02b4a01 server: allow to override threads server pool with --threads-http (#5794) 1 year ago
  Pierrick Hymbert 8b350356b2 server: docs - refresh and tease a little bit more the http server (#5718) 1 year ago
  Pierrick Hymbert 930b178026 server: logs - unified format and --log-format option (#5700) 1 year ago
  Pierrick Hymbert d52d7819b8 server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708) 1 year ago
  Pierrick Hymbert 525213d2f5 server: init functional tests (#5566) 1 year ago
  Alexey Parfenov c5688c6250 server : clarify some params in the docs (#5640) 1 year ago
  Xuan Son Nguyen 7c8bcc11dc Add docs for llama_chat_apply_template (#5645) 1 year ago
  Pierrick Hymbert 1ecea255eb server: health: fix race condition on slots data using tasks queue (#5634) 1 year ago
  Pierrick Hymbert c0a8c6db37 server : health endpoint configurable failure on no slot (#5594) 1 year ago
  Robey Holderith 5ee99c32f5 common, server : surface min_keep as its own parameter (#5567) 1 year ago
  Pierrick Hymbert c145f8a132 server : slots monitoring endpoint (#5550) 1 year ago
  Pierrick Hymbert e75c6279d1 server : enhanced health endpoint (#5548) 1 year ago
  Pierrick Hymbert 36376abe05 server : --n-predict option document and cap to max value (#5549) 1 year ago
  Alexey Parfenov 6dcc02d244 server : add "samplers" param to control the samplers order (#5494) 1 year ago
  bmwl f486f6e1e5 ggml : add numa options (#5377) 1 year ago
  Alexey Parfenov 684780141a server : allow to specify tokens as strings in logit_bias (#5003) 1 year ago
  Justin Parker f3e2b4fa3f server : update `/props` with "total_slots" value (#5373) 1 year ago
  Michael Coppola 31e7903221 server : add `dynatemp_range` and `dynatemp_exponent` (#5352) 1 year ago
  Alexey Parfenov a2d60c9158 server : allow to get default generation settings for completion (#5307) 1 year ago
  Wu Jian Ping 6685cc41c2 server : improve README (#5209) 2 years ago
  Kyle Mistele 39baaf55a1 docker : add server-first container images (#5157) 2 years ago
  Maximilian Winter ec903c0341 server : add self-extend support (#5104) 2 years ago
  Michael Coppola 27379455c3 server : support for multiple api keys (#4864) 2 years ago
  Behnam M 7a9f75c38b server : update readme to document the new `/health` endpoint (#4866) 2 years ago
  Behnam M 128de3585b server : update readme about token probs (#4777) 2 years ago
  Zsapi 8c58330318 server : add api-key flag to documentation (#4832) 2 years ago
  Michael Coppola e5804313a1 server : fix options in README.md (#4765) 2 years ago
  Karthik Sethuraman b93edd22f5 server : allow to generate multimodal embeddings (#4681) 2 years ago
  Alexey Parfenov 6123979952 server : allow to specify custom prompt for penalty calculation (#3727) 2 years ago