Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Sam/Samuel | 3f750f8d76 | metal: add support for opt_step_sgd (#16539) | 3 months ago |
| Georgi Gerganov | c515fc5771 | ggml : fix scalar path for computing norm (#16558) | 3 months ago |
| hipudding | f9bc66c3eb | CANN: Update several operators to support FP16 data format (#16251) | 3 months ago |
| Sam/Samuel | a31cf36ad9 | metal : add opt_step_adamw and op_sum (#16529) | 3 months ago |
| Pascal | 81d54bbfd5 | webui: remove client-side context pre-check and rely on backend for limits (#16506) | 3 months ago |
| Neo Zhang Jianyu | c7be9febcb | [SYCL] fix UT fault cases: count-equal, argsort, pad OPs (#16521) | 3 months ago |
| Mathieu Baudier | 8415f61e23 | ci : add Vulkan on Ubuntu with default packages build (#16532) | 3 months ago |
| Aldehir Rojas | 2c301e91ab | common : handle unicode during partial json parsing (#16526) | 3 months ago |
| Georgi Gerganov | 4b2dae383d | common : update presets (#16504) | 3 months ago |
| sirus20x6 | 41aac5c69b | ggml : Fix FP16 ELU positive branch (#16519) | 3 months ago |
| Daniel Bevenius | a2fba89a42 | hparams : add check for layer index in is_recurrent (#16511) | 3 months ago |
| sirus20x6 | 20cc625edc | ggml: Correct SVE implementation in ggml_vec_dot_f16_unroll (#16518) | 3 months ago |
| Johannes Gäßler | 11f0af5504 | CUDA: faster tile FA, add oob checks, more HSs (#16492) | 3 months ago |
| Georgi Gerganov | a3cb04744f | metal : fix mul-mm condition + fix mul-mv permuted kernels (#16494) | 3 months ago |
| Pascal | 4a8fbe0a5e | feat: render user content as markdown option (#16358) | 3 months ago |
| Yann Follet | 31d0ff1869 | server / ranking : add sorting and management of top_n (#16403) | 3 months ago |
| Diego Devesa | 97870e6497 | cuda : avoid initializing unused devices (#16510) | 3 months ago |
| amirai21 | 477a66b035 | convert : correctly handle LLaMA tokenizer for Jamba (#16470) | 3 months ago |
| Georgi Gerganov | e60f01d941 | server : fix division by zero when reporting stats (#16501) | 3 months ago |
| Georgi Gerganov | 81086cd6a3 | vocab : mark EOT token for Granite models (#16499) | 3 months ago |
| Radoslav Gerganov | 68ee98ae18 | server : return HTTP 400 if prompt exceeds context length (#16486) | 3 months ago |
| Radoslav Gerganov | cdb6da468c | server : log requests to /v1/completions (#16495) | 3 months ago |
| Prajwal B Mehendarkar | 6d69ab3f26 | cmake : Dont define XOPENSOURCE on AIX (#16481) | 3 months ago |
| Pascal | 1faa13a118 | webui: updated the chat service to only include max_tokens in the req… (#16489) | 3 months ago |
| duduta | 1deee0f8d4 | cpu : optimize the ggml NORM operation (#15953) | 3 months ago |
| Georgi Gerganov | d00cbea63c | server : host-memory prompt caching (#16391) | 3 months ago |
| Pascal | 8328fd4bae | No markdown in cot (#16483) | 3 months ago |
| Daniel Bevenius | 56b4795842 | model-conversion : add support for SentenceTransformers (#16387) | 3 months ago |
| sudhiarm | 2c0d875ae6 | ci: add ARM64 Kleidiai build and test support (#16462) | 3 months ago |
| Chenguang Li | aa4711d369 | CANN: Improve ACL graph matching (#16166) | 3 months ago |