Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| hipudding | 6ba6a3c76f | docs : update ops.md for CANN backend (#18654) | 1 week ago |
| Perry Naseck | 0802d4cfb3 | ggml-blas: hide warnings from included BLAS headers (#18818) | 1 week ago |
| Tarek Dakhran | c945aaaef2 | mtmd : Fix ASR for LFM2.5-Audio-1.5B (#18876) | 1 week ago |
| Xuan-Son Nguyen | c15395f73c | common : implement new jinja template engine (#18462) | 1 week ago |
| Julius Tischbein | aa1dc3770a | Setting mmap and direct_io to false as default in llama-bench.cpp (#18841) | 1 week ago |
| Raul Torres | 4ea2eaac01 | CANN: Remove unused `ggml_cann_get_device` function (#18625) | 1 week ago |
| Chenguang Li | e20fa27a02 | CANN: fix an issue where get_env was not fully renamed (#18796) | 1 week ago |
| hipudding | baa4ba0aec | CANN: support gated linear attn (#18653) | 1 week ago |
| shaofeiqi | 785a710085 | OpenCL: add SOLVE_TRI op support (#18846) | 2 weeks ago |
| Georgi Gerganov | 6e7fc8a146 | cuda : print less debug logs when disabling cuda graphs (#18868) | 2 weeks ago |
| Georgi Gerganov | be8e3d9515 | context : do not reserve scheduler for warmups (#18867) | 2 weeks ago |
| ddh0 | 13f1e4a9ca | llama : add adaptive-p sampler (#17927) | 2 weeks ago |
| Xuan-Son Nguyen | a04c2b06a3 | server: improve slots scheduling for n_cmpl (#18789) | 2 weeks ago |
| Georgi Gerganov | 39173bcacb | context : reserve new scheduler when graph topology changes (#18547) | 2 weeks ago |
| Johannes Gäßler | 5c662d21a3 | CUDA: fix allignment on register spill for FA (#18815) | 2 weeks ago |
| shalinib-ibm | 8cc0ba957b | ggml-cpu: optimize ggml_vec_dot_bf16 for Power9 (#18837) | 2 weeks ago |
| Xuan-Son Nguyen | a7e6ddb8bd | lora: make sure model keep track of associated adapters (#18490) | 2 weeks ago |
| Sigbjørn Skjæret | 2a13180100 | model-loader : support bool array sliding window pattern (#18850) | 2 weeks ago |
| Adrien Gallouët | ec997b4f2b | tests : download models only when running ctest (#18843) | 2 weeks ago |
| Max Krasnyansky | cff777f226 | hexagon: support for OP_CPY, host buffers now optional, hvx-utils refactoring and optimizations (#18822) | 2 weeks ago |
| Oliver Simons | 36f0132464 | CUDA: Factor out and re-use `block_reduce` function (#18785) | 2 weeks ago |
| Piotr Wilkin (ilintar) | d98b548120 | Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914) | 2 weeks ago |
| Junwon Hwang | 8fb7175576 | model : clean up and fix EXAONE-MoE configuration (#18840) | 2 weeks ago |
| Adrien Gallouët | 516a4ca9b5 | refactor : remove libcurl, use OpenSSL when available (#18828) | 2 weeks ago |
| Jeff Bolz | 3e4bb29666 | vulkan: Check maxStorageBufferRange in supports_op (#18709) | 2 weeks ago |
| Aman Gupta | 47f9612492 | llama-model: fix unfortunate typo (#18832) | 2 weeks ago |
| Daniel Bevenius | 01cbdfd7eb | CUDA : fix typo in clang pragma comment [no ci] (#18830) | 2 weeks ago |
| Ruben Ortlam | 635ef78ec5 | vulkan: work around Intel fp16 bug in mmq (#18814) | 2 weeks ago |
| Perry Naseck | 7d587e5544 | ggml-metal: do not copy headers for embedded, use current binary dir for embedded (#18705) | 2 weeks ago |
| Daniel Benjaminsson | d34aa07193 | mmap: add Haiku support by skipping RLIMIT_MEMLOCK check (#18819) | 2 weeks ago |