Commit History

Author SHA1 Message Date
  Georgi Gerganov 39173bcacb context : reserve new scheduler when graph topology changes (#18547) 2 weeks ago
  Johannes Gäßler 5c662d21a3 CUDA: fix allignment on register spill for FA (#18815) 2 weeks ago
  shalinib-ibm 8cc0ba957b ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (#18837) 2 weeks ago
  Xuan-Son Nguyen a7e6ddb8bd lora: make sure model keep track of associated adapters (#18490) 2 weeks ago
  Sigbjørn Skjæret 2a13180100 model-loader : support bool array sliding window pattern (#18850) 2 weeks ago
  Adrien Gallouët ec997b4f2b tests : download models only when running ctest (#18843) 2 weeks ago
  Max Krasnyansky cff777f226 hexagon: support for OP_CPY, host buffers now optional, hvx-utils refactoring and optimizations (#18822) 2 weeks ago
  Oliver Simons 36f0132464 CUDA: Factor out and re-use `block_reduce` function (#18785) 2 weeks ago
  Piotr Wilkin (ilintar) d98b548120 Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914) 2 weeks ago
  Junwon Hwang 8fb7175576 model : clean up and fix EXAONE-MoE configuration (#18840) 2 weeks ago
  Adrien Gallouët 516a4ca9b5 refactor : remove libcurl, use OpenSSL when available (#18828) 2 weeks ago
  Jeff Bolz 3e4bb29666 vulkan: Check maxStorageBufferRange in supports_op (#18709) 2 weeks ago
  Aman Gupta 47f9612492 llama-model: fix unfortunate typo (#18832) 2 weeks ago
  Daniel Bevenius 01cbdfd7eb CUDA : fix typo in clang pragma comment [no ci] (#18830) 2 weeks ago
  Ruben Ortlam 635ef78ec5 vulkan: work around Intel fp16 bug in mmq (#18814) 2 weeks ago
  Perry Naseck 7d587e5544 ggml-metal: do not copy headers for embedded, use current binary dir for embedded (#18705) 2 weeks ago
  Daniel Benjaminsson d34aa07193 mmap: add Haiku support by skipping RLIMIT_MEMLOCK check (#18819) 2 weeks ago
  Adrien Gallouët f709c7a33f ci, tests : use cmake to download models and remove libcurl dependency (#18791) 2 weeks ago
  ddh0 6e36299b47 llama : print_info alignment fix (#18708) 2 weeks ago
  Junwon Hwang 60591f01d4 model : add EXAONE MoE (#18543) 2 weeks ago
  Georgi Gerganov e4832e3ae4 vocab : fix attribute overrides for harmony (#18806) 2 weeks ago
  Ruben Ortlam 960e5e3b46 llama-mmap: fix direct-io loading fallback EOF exception (#18801) 2 weeks ago
  Daniel Bevenius 20ca2e12c4 model-conversion : remove -c 0 from model card template [no ci] (#18807) 2 weeks ago
  yulo ea4a321f2a HIP: add fattn-mma-f16 for RDNA4 (#18481) 2 weeks ago
  Johannes Gäßler c1e79e610f doc: ban AI-generated PR descriptions [no ci] (#18765) 2 weeks ago
  Xuan-Son Nguyen e047f9ee9d mtmd: fix use_non_causal being reported incorrectly (#18793) 2 weeks ago
  Georgi Gerganov 0a57271ab6 CUDA : fix unused argument when USE_CUDA_GRAPH=OFF (#18800) 2 weeks ago
  Gabe Goodhart 076b0faf7d graph : clean up t5 input builders (#18795) 2 weeks ago
  Ruben Ortlam db79dc06b1 llama-bench: add direct_io parameter (#18778) 2 weeks ago
  Adrien Gallouët 537d4240d4 ci : remove libcurl in releases (#18775) 2 weeks ago