Commit History

Author SHA1 Message Date
  Xuan Son Nguyen 48e6b92cc3 Add chat template support for llama-cli (#8068) 1 year ago
  sasha0552 7a16ce7db2 server : smart slot selection using Longest Common Prefix (#7728) 1 year ago
  Georgi Gerganov 1442677f92 common : refactor cli arg parsing (#7675) 1 year ago
  Benjamin Findley e586ee4259 change default temperature of OAI compat API from 0 to 1 (#7226) 1 year ago
  Johannes Gäßler c12452c7ae JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143) 1 year ago
  Xuan Son Nguyen 1fd9c1741d clean up json_value & server_log (#7142) 1 year ago
  Pedro Cuenca b97bc3966e llama : support Llama 3 HF conversion (#6745) 1 year ago
  Pierrick Hymbert 75cd4c7729 ci: bench: support sse and fix prompt processing time / server: add tokens usage in stream OAI response (#6495) 1 year ago
  JH23X 60cdf40cc3 server : handle exception on wrong type in request (#6452) 1 year ago
  Xuan Son Nguyen ad3a0505e3 Server: clean up OAI params parsing function (#6284) 1 year ago
  Pierrick Hymbert 1b26aebe4d server: flush stdout after logging in both text and json layout (#6253) 1 year ago
  Olivier Chafik 72114edf06 json-schema-to-grammar : fix order of props + non-str const/enum (#6232) 1 year ago
  Olivier Chafik 5b7b0ac8df json-schema-to-grammar improvements (+ added to server) (#5978) 1 year ago
  Karthick 47cc7a7bf9 Server: Handle n_keep parameter in the request (#6174) 1 year ago
  Xuan Son Nguyen 99b71c068f Server: Use multi-task for embeddings endpoint (#6001) 1 year ago
  Xuan Son Nguyen caa106d4e0 Server: format error to json (#5961) 1 year ago
  Minsoo Cheong 332bdfd798 server : maintain chat completion id for streaming responses (#5988) 1 year ago
  Georgi Gerganov 2002bc96bf server : refactor (#5882) 1 year ago
  Pierrick Hymbert 9731134296 server: tests: passkey challenge / self-extend with context shift demo (#5832) 1 year ago
  Xuan Son Nguyen 052051d8ae Server: normalize naming (#5779) 1 year ago
  Pierrick Hymbert 930b178026 server: logs - unified format and --log-format option (#5700) 1 year ago
  Pierrick Hymbert d52d7819b8 server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708) 1 year ago
  Pierrick Hymbert 1ecea255eb server: health: fix race condition on slots data using tasks queue (#5634) 1 year ago
  Xuan Son Nguyen 9c405c9f9a Server: use llama_chat_apply_template (#5593) 1 year ago
  Daniel Hiltgen 66c1968f7a server : graceful server shutdown (#5244) 1 year ago
  Xuan Son Nguyen 907e08c110 server : add llama2 chat template (#5425) 1 year ago
  Georgi Gerganov 753eafed0e sync : ggml 2 years ago
  Xuan Son Nguyen 48c857aa10 server : refactored the task processing logic (#5065) 2 years ago