Commit History

Author SHA1 Message Date
  slaren 4f407a0a35 llama : add model types for mixtral (#6589) 1 year ago
  slaren 65c64dc36f convert.py : add consolidated.safetensors for mixtral 8x22b (#6587) 1 year ago
  Pierrick Hymbert 67fac4b95f docs : how to add a model (#6565) 1 year ago
  Artem Zinnatullin 29122d32ac readme : fix ROCm link (#6579) 1 year ago
  sjxx b231b37b09 readme : update UI list (#6560) 1 year ago
  Jiří Sejkora ba5e134e07 readme: fix typo in amdgpu target name (#6573) 1 year ago
  Jared Van Bortel 1b67731e18 BERT tokenizer fixes (#6498) 1 year ago
  Georgi Gerganov c4a3a4ff47 sync : ggml 1 year ago
  Ed Lee 400d5d722d server : detect search query to start webchat (#6554) 1 year ago
  Carolinabanana 5dc9dd7152 llama : add Command R Plus support (#6491) 1 year ago
  Georgi Gerganov e11a8999b5 license : update copyright notice + add AUTHORS (#6405) 1 year ago
  Georgi Gerganov cc4a95426d llama : fix attention layer count sanity check (#6550) 1 year ago
  kunnis cecd8d3c98 Comment explaining a decision (#6531) 1 year ago
  Georgi Gerganov b73e564b16 quantize : fix precedence of cli args (#6541) 1 year ago
  Rick G e3c337d87c llama : support negative ith in llama_get_ API (#6519) 1 year ago
  Jan Boon beea6e1b16 llama : save and restore kv cache for single seq id (#6341) 1 year ago
  Abhilash Majumder 87fb5b4234 remove row=1 cond (#6532) 1 year ago
  Firat d752327c33 Adding KodiBot to UI list (#6535) 1 year ago
  Mark Fairbairn 855f54402e Change Windows AMD example to release build to make inference much faster. (#6525) 1 year ago
  Georgi Gerganov b909236c0b flake.lock: Update (#6517) 1 year ago
  DAN™ e0717e751e Add GritLM as supported models. (#6513) 1 year ago
  Georgi Gerganov c37247796b sync : ggml 1 year ago
  Slava Primenko f77261a7c5 ggml: bypass code incompatible with CUDA < 11.1 (whisper/2020) 1 year ago
  Georgi Gerganov 43e8995e75 scripts : sync ggml-cuda folder 1 year ago
  limitedAtonement 9472bce308 Run make to build the project (#6457) 1 year ago
  Neo Zhang Jianyu d4f220a5cc support/fix OPs GGML_TYPE_IQ4_NL, GGML_TYPE_IQ4_XS, GGML_TYPE_IQ3_XXS, GGML_TYPE_IQ3_S, GGML_TYPE_IQ2_XXS, GGML_TYPE_IQ2_XS, GGML_TYPE_IQ2_S, GGML_TYPE_IQ1_S, GGML_TYPE_IQ1_M (#6521) 1 year ago
  Georgi Gerganov 54ea0698fb sync : ggml 1 year ago
  Daniel Bevenius b66aec675c backend : fix typo in scheduler documentation (ggml/781) 1 year ago
  Clint Herron 57dd02c44b Tests: Added integration tests for GBNF parser (#6472) 1 year ago
  Pierrick Hymbert 75cd4c7729 ci: bench: support sse and fix prompt processing time / server: add tokens usage in stream OAI response (#6495) 1 year ago