Commit History

Author SHA1 Message Date
  Ruben Ortlam 635ef78ec5 vulkan: work around Intel fp16 bug in mmq (#18814) 2 weeks ago
  Perry Naseck 7d587e5544 ggml-metal: do not copy headers for embedded, use current binary dir for embedded (#18705) 2 weeks ago
  Daniel Benjaminsson d34aa07193 mmap: add Haiku support by skipping RLIMIT_MEMLOCK check (#18819) 2 weeks ago
  Adrien Gallouët f709c7a33f ci, tests : use cmake to download models and remove libcurl dependency (#18791) 2 weeks ago
  ddh0 6e36299b47 llama : print_info alignment fix (#18708) 2 weeks ago
  Junwon Hwang 60591f01d4 model : add EXAONE MoE (#18543) 2 weeks ago
  Georgi Gerganov e4832e3ae4 vocab : fix attribute overrides for harmony (#18806) 2 weeks ago
  Ruben Ortlam 960e5e3b46 llama-mmap: fix direct-io loading fallback EOF exception (#18801) 2 weeks ago
  Daniel Bevenius 20ca2e12c4 model-conversion : remove -c 0 from model card template [no ci] (#18807) 2 weeks ago
  yulo ea4a321f2a HIP: add fattn-mma-f16 for RDNA4 (#18481) 2 weeks ago
  Johannes Gäßler c1e79e610f doc: ban AI-generated PR descriptions [no ci] (#18765) 2 weeks ago
  Xuan-Son Nguyen e047f9ee9d mtmd: fix use_non_causal being reported incorrectly (#18793) 2 weeks ago
  Georgi Gerganov 0a57271ab6 CUDA : fix unused argument when USE_CUDA_GRAPH=OFF (#18800) 2 weeks ago
  Gabe Goodhart 076b0faf7d graph : clean up t5 input builders (#18795) 2 weeks ago
  Ruben Ortlam db79dc06b1 llama-bench: add direct_io parameter (#18778) 2 weeks ago
  Adrien Gallouët 537d4240d4 ci : remove libcurl in releases (#18775) 2 weeks ago
  Radoslav Gerganov bcf7546160 server : add arg for disabling prompt caching (#18776) 2 weeks ago
  Adrien Gallouët 36c5913c45 ci : use openssl for openEuler-latest-cmake-cann (#18779) 2 weeks ago
  Adrien Gallouët 8e649571cd vendor : update cpp-httplib to 0.30.1 (#18771) 2 weeks ago
  Daniel Bevenius 4150da9a95 examples : add --kv-unified to batched example (#18774) 2 weeks ago
  Jeff Bolz 8e2da778da vulkan: change memory_logger to be controlled by an env var (#18769) 2 weeks ago
  Xuan-Son Nguyen ce3bf9b1a4 server: update docs for sleeping [no ci] (#18777) 2 weeks ago
  Jeff Bolz 2bbe4c2cf8 vulkan: Use VK_EXT_shader_64bit_indexing to handle large mat_mul(_id) (#18678) 2 weeks ago
  Ruben Ortlam 1051ecd289 vulkan: Disable large coopmat matmul configuration on proprietary AMD driver (#18763) 2 weeks ago
  Xuan-Son Nguyen 0c3b7a9efe model: fix qwen3next broken due to #18683 (#18762) 2 weeks ago
  Ruben Ortlam 0e76501e1d Vulkan: Optimize Matmul parameters for AMD GPUs with Coopmat support (#18749) 2 weeks ago
  Xuan-Son Nguyen 4b060bf240 security: make it clear about subtopics in server (#18754) 2 weeks ago
  Daniel Bevenius 9789e28459 debug : include LLAMA_POOLING_TYPE_UNSPECIFIED in pooling check (#18692) 2 weeks ago
  Georgi Gerganov 84ae04f163 tests : refactor test-backend-sampler (#18753) 2 weeks ago
  Xuan-Son Nguyen 506bb6e010 model: try to improve Qwen3 Next (#18683) 2 weeks ago