Commit History

Author SHA1 Message Date
  Pierrick Hymbert 3ab8b3a92e llama : cleanup unused mmq flags (#5772) 1 year ago
  Pierrick Hymbert 5cb02b4a01 server: allow to override threads server pool with --threads-http (#5794) 1 year ago
  Georgi Gerganov f105471ef6 server : fix newlines in help (#5785) 1 year ago
  Xuan Son Nguyen 052051d8ae Server: normalize naming (#5779) 1 year ago
  Xuan Son Nguyen a693bea1e6 server : hit Ctrl+C twice to exit (#5734) 1 year ago
  Jorge A efc72253f7 server : add "/chat/completions" alias for "/v1/...` (#5722) 1 year ago
  Xuan Son Nguyen b11a93df41 fix server hangs on empty prompt (#5733) 1 year ago
  Georgi Gerganov bf08e00643 llama : refactor k-shift implementation + KV defragmentation (#5691) 1 year ago
  compilade f7625019c5 server : fix crash when system prompt is bigger than batch size (#5714) 1 year ago
  Pierrick Hymbert 930b178026 server: logs - unified format and --log-format option (#5700) 1 year ago
  Pierrick Hymbert d52d7819b8 server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708) 1 year ago
  Georgi Gerganov ab336a9d5e code : normalize enum names (#5697) 1 year ago
  Pierrick Hymbert 9e359a4f47 server: continue to update other slots on embedding concurrent request (#5699) 1 year ago
  Pierrick Hymbert 525213d2f5 server: init functional tests (#5566) 1 year ago
  AlpinDale fd43d66f46 server : add KV cache quantization options (#5684) 1 year ago
  Xuan Son Nguyen a46f50747b server : fallback to chatml, add AlphaMonarch chat template (#5628) 1 year ago
  Jared Van Bortel 89febfed93 examples : do not assume BOS when shifting context (#5622) 1 year ago
  Pierrick Hymbert 1ecea255eb server: health: fix race condition on slots data using tasks queue (#5634) 1 year ago
  CJ Pais 6560bed3f0 server : support llava 1.6 (#5553) 1 year ago
  Xuan Son Nguyen 9c405c9f9a Server: use llama_chat_apply_template (#5593) 1 year ago
  Pierrick Hymbert c0a8c6db37 server : health endpoint configurable failure on no slot (#5594) 1 year ago
  Robey Holderith 5ee99c32f5 common, server : surface min_keep as its own parameter (#5567) 1 year ago
  Pierrick Hymbert c145f8a132 server : slots monitoring endpoint (#5550) 1 year ago
  Pierrick Hymbert e75c6279d1 server : enhanced health endpoint (#5548) 1 year ago
  Pierrick Hymbert 36376abe05 server : --n-predict option document and cap to max value (#5549) 1 year ago
  Daniel Hiltgen 66c1968f7a server : graceful server shutdown (#5244) 1 year ago
  Alexey Parfenov 6dcc02d244 server : add "samplers" param to control the samplers order (#5494) 1 year ago
  Rőczey Barnabás 5f5808ca7b server : fix system prompt cli (#5516) 1 year ago
  bmwl f486f6e1e5 ggml : add numa options (#5377) 1 year ago
  Elbios 0d4177126b llava : fix memory management bug (#5491) 1 year ago