Commit History

Author SHA1 Message Date
  Georgi Gerganov f3e64859ed ci : fix arm upload artifacts (#12024) 11 months ago
  Johannes Gäßler 5fa07c2f93 CUDA: optimize FA for GQA + large batches (#12014) 11 months ago
  Rohanjames1997 335eb04a91 ci : Build on Github-hosted arm64 runners (#12009) 11 months ago
  Georgi Gerganov cf756d6e0a server : disable Nagle's algorithm (#12020) 11 months ago
  Gian-Carlo Pascutto d70908421f cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (#12000) 11 months ago
  Daniel Bevenius de8b5a3624 llama.swiftui : add "Done" dismiss button to help view (#11998) 11 months ago
  Georgi Gerganov 51f311e057 llama : skip loading unused tensors (#12004) 11 months ago
  Johannes Gäßler 586d5fe6eb doc: update contributing guidelines [no ci] (#11969) 11 months ago
  PureJourney ecc8e3aeff CUDA: correct the lowest Maxwell supported by CUDA 12 (#11984) 11 months ago
  Bodhi 0b3863ff95 MUSA: support ARM64 and enable dp4a .etc (#11843) 11 months ago
  Alex Brooks ee02ad02c5 clip : fix visual encoders with no CLS (#11982) 11 months ago
  momonga c392e5094d server (webui): Fix Premature Submission During IME Conversion (#11971) 11 months ago
  Charles Xu c5d91a7400 ggml-cpu: Add CPU backend support for KleidiAI library (#11390) 11 months ago
  Prashant Vithule 4806498bf1 ggml: aarch64: implement SVE kernels for q3_K_q8_K vector dot (#11917) 11 months ago
  Michael Engel 0d559580a0 run : add --chat-template-file (#11961) 11 months ago
  Johannes Gäßler d04e7163c8 doc: add links to ggml examples [no ci] (#11958) 11 months ago
  Daniel Bevenius d07c621393 common : add llama.vim preset for Qwen2.5 Coder (#11945) 11 months ago
  Georgi Gerganov abd4d0bc4f speculative : update default params (#11954) 11 months ago
  Daniel Bevenius 9626d9351a llama : fix indentation in llama-grammar [no ci] (#11943) 11 months ago
  igardev b58934c183 server : (webui) Enable communication with parent html (if webui is in iframe) (#11940) 11 months ago
  Olivier Chafik 63e489c025 tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900) 11 months ago
  Xuan-Son Nguyen 63ac128563 server : add TEI API format for /rerank endpoint (#11942) 11 months ago
  MoonRide303 5137da7b8c scripts: corrected encoding when getting chat template (#11866) (#11907) 11 months ago
  xiaobing318 09aaf4f1f5 docs : Fix duplicated file extension in test command (#11935) 11 months ago
  Johannes Gäßler 73e2ed3ce3 CUDA: use async data loading for FlashAttention (#11894) 11 months ago
  Eve f7b1116af1 update release requirements (#11897) 11 months ago
  Antoine Viallon c4d29baf32 server : fix divide-by-zero in metrics reporting (#11915) 11 months ago
  Rémy O 2eea03d86a vulkan: implement several ops relevant for ggml_opt (#11769) 11 months ago
  Xuan-Son Nguyen 0f2bbe6564 server : bump httplib to 0.19.0 (#11908) 11 months ago
  standby24x7 fe163d5bf3 common : Fix a typo in help (#11899) 11 months ago