Commit History

Author SHA1 Message Date
  Georgi Gerganov a3cb04744f metal : fix mul-mm condition + fix mul-mv permuted kernels (#16494) 3 months ago
  Pascal 4a8fbe0a5e feat: render user content as markdown option (#16358) 3 months ago
  Yann Follet 31d0ff1869 server / ranking : add sorting and management of top_n (#16403) 3 months ago
  Diego Devesa 97870e6497 cuda : avoid initializing unused devices (#16510) 3 months ago
  amirai21 477a66b035 convert : correctly handle LLaMA tokenizer for Jamba (#16470) 3 months ago
  Georgi Gerganov e60f01d941 server : fix division by zero when reporting stats (#16501) 3 months ago
  Georgi Gerganov 81086cd6a3 vocab : mark EOT token for Granite models (#16499) 3 months ago
  Radoslav Gerganov 68ee98ae18 server : return HTTP 400 if prompt exceeds context length (#16486) 3 months ago
  Radoslav Gerganov cdb6da468c server : log requests to /v1/completions (#16495) 3 months ago
  Prajwal B Mehendarkar 6d69ab3f26 cmake : Dont define XOPENSOURCE on AIX (#16481) 3 months ago
  Pascal 1faa13a118 webui: updated the chat service to only include max_tokens in the req… (#16489) 3 months ago
  duduta 1deee0f8d4 cpu : optimize the ggml NORM operation (#15953) 3 months ago
  Georgi Gerganov d00cbea63c server : host-memory prompt caching (#16391) 3 months ago
  Pascal 8328fd4bae No markdown in cot (#16483) 3 months ago
  Daniel Bevenius 56b4795842 model-conversion : add support for SentenceTransformers (#16387) 3 months ago
  sudhiarm 2c0d875ae6 ci: add ARM64 Kleidiai build and test support (#16462) 3 months ago
  Chenguang Li aa4711d369 CANN: Improve ACL graph matching (#16166) 3 months ago
  Charles Xu d80d6d2400 kleidiai: kernel interface refactoring (#16460) 3 months ago
  Neo Zhang Jianyu b260213755 [SYCL] refactor soft_max, add soft_max_back (#16472) 3 months ago
  Saba Fallah e08db42595 model: EmbeddingGemma Adding Support for SentenceTransformers Dense Modules (#16367) 3 months ago
  Pascal 12bbc3fa50 refactor: centralize CoT parsing in backend for streaming mode (#16394) 3 months ago
  ai-fonsi 9d0882840e Disable CUDA host buffers on integrated GPUs (#16308) 3 months ago
  issixx d2ee056e1d server : fix cancel pending task (#16467) 3 months ago
  Georgi Gerganov b2c08c9ec4 metal : mark FA blocks (#16372) 3 months ago
  Georgi Gerganov 7fdd16b432 server : improve context checkpoint logic (#16440) 3 months ago
  Reese Levine 74b8fc17f9 ggml webgpu: profiling, CI updates, reworking of command submission (#16452) 3 months ago
  Tarek Dakhran aeaf8a36f0 llama : support LiquidAI LFM2-MoE hybrid model (#16464) 3 months ago
  Georgi Gerganov df1b612e29 server : add `/v1/health` endpoint (#16461) 3 months ago
  Sascha Rogmann 4e0388aa8a webui : added download action (#13552) (#16282) 3 months ago
  Georgi Gerganov ef4c5b87ea presets : fix pooling param for embedding models (#16455) 3 months ago