Commit History

Author SHA1 Message Date
  Georgi Gerganov 438c2ca830 server : parallel decoding and multimodal (#3677) 2 years ago
  Georgi Gerganov e74c705e15 editorconfig : remove trailing spaces 2 years ago
  coezbek 3ad1e3f1a1 server : documentation of JSON return value of /completion endpoint (#3632) 2 years ago
  Mihai cb13d73a72 server : docs fix default values and add n_probs (#3506) 2 years ago
  vvhg1 c97f01c362 infill : add new example + extend server API (#3296) 2 years ago
  slaren 16bc66d947 llama.cpp : split llama_context_params into model and context params (#3301) 2 years ago
  Bruce MacDonald c1ac54b77a server : add `/detokenize` endpoint (#2802) 2 years ago
  lon bae5c5f679 examples : skip unnecessary external lib in server README.md how-to (#2804) 2 years ago
  Xiao-Yong Jin b8ad1b66b2 server : allow json array in prompt or content for direct token input (#2306) 2 years ago
  Georgi Gerganov 6381d4e110 gguf : new file format with flexible meta data (beta) (#2398) 2 years ago
  Cheng Shao d75561df20 server : add --numa support (#2524) 2 years ago
  Martin Krasser f5bfea0580 Allow passing grammar to completion endpoint (#2532) 2 years ago
  Bono Lv c574bddb36 fix a typo in examples/server/README.md (#2478) 2 years ago
  Xiao-Yong Jin 6e7cca4047 llama : add custom RoPE (#2054) 2 years ago
  Howard Su 32c5411631 Revert "Support using mmap when applying LoRA (#2095)" (#2206) 2 years ago
  Howard Su 2347463201 Support using mmap when applying LoRA (#2095) 2 years ago
  Judd 36680f6e40 convert : update for baichuan (#2081) 2 years ago
  Tobias Lütke 31cfbb1013 Expose generation timings from server & update completions.js (#2116) 2 years ago
  Jesse Jojo Johnson 983b555e9d Update Server Instructions (#2113) 2 years ago
  Jesse Jojo Johnson 8567c76b53 Update server instructions for web front end (#2103) 2 years ago
  jwj7140 f257fd2550 Add an API example using server.cpp similar to OAI. (#2009) 2 years ago
  Howard Su b8c8dda75f Use unsigned for random seed (#2006) 2 years ago
  Henri Vasserman 20568fe60f [Fix] Reenable server embedding endpoint (#1937) 2 years ago
  Randall Fitzgerald 794db3e7b9 Server Example Refactor and Improvements (#1570) 2 years ago
  Srinivas Billa 9dda13e5e1 readme : server compile flag (#1874) 2 years ago
  Johannes Gäßler 254a7a7a5f CUDA full GPU acceleration, KV cache in VRAM (#1827) 2 years ago
  Johannes Gäßler 17366df842 Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703) 2 years ago
  Kerfuffle 1b78ed2081 Only show -ngl option when relevant + other doc/arg handling updates (#1625) 2 years ago
  Steward Garcia 7e4ea5beff examples : add server example with REST API (#1443) 2 years ago