Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| hipudding | 204f2cf168 | CANN: Add ggml_set_rows (#14943) | 5 months ago |
| Sigbjørn Skjæret | 138b288b59 | cuda : add softcap fusion (#14907) | 5 months ago |
| Johannes Gäßler | bbd0f91779 | server-bench: make seed choice configurable (#14929) | 5 months ago |
| Aman Gupta | 0a5036bee9 | CUDA: add roll (#14919) | 5 months ago |
| lhez | 8ad7b3e65b | opencl : add ops docs (#14910) | 5 months ago |
| Leonard Mosescu | bda62193b2 | test-backend-ops : extend test case filtering (#14865) | 5 months ago |
| Radoslav Gerganov | c556418b60 | llama-bench : use local GPUs along with RPC servers (#14917) | 5 months ago |
| xctan | db16e2831c | ggml-cpu : deduplicate scalar implementations (#14897) | 5 months ago |
| Akarshan Biswas | cd1fce6d4f | SYCL: Add set_rows support for quantized types (#14883) | 5 months ago |
| Xuan-Son Nguyen | 00fa15fedc | mtmd : add support for Voxtral (#14862) | 5 months ago |
| Johannes Gäßler | 946b1f6859 | CUDA: fix pointer incrementation in FA (#14916) | 5 months ago |
| Dongliang Wei | 6c6e397aff | model : add support for SmallThinker series (#14898) | 5 months ago |
| Alberto Cabrera Pérez | afc0e89698 | sycl: refactor quantization to q8_1 (#14815) | 5 months ago |
| Georgi Gerganov | a5771c9eea | ops : update BLAS (#14914) | 5 months ago |
| Georgi Gerganov | c35f9eaf09 | ops : update Metal (#14912) | 5 months ago |
| Georgi Gerganov | 1f45f2890e | sync : ggml | 5 months ago |
| Kai Pastor | 613c5095c3 | cmake : Indent ggml-config.cmake (ggml/1310) | 6 months ago |
| Ed Addario | 7f97599581 | quantize : update README.md (#14905) | 5 months ago |
| Ruben Ortlam | bf78f5439e | vulkan: add ops docs (#14900) | 5 months ago |
| Akarshan Biswas | bbfc849274 | SYCL: add ops doc (#14901) | 5 months ago |
| Daniel Bevenius | ca0ef2dddb | llama : clarify comment about pp and tg graphs [no ci] (#14895) | 5 months ago |
| Erik Scholz | 89d1029559 | vulkan : add fp16 support for the conv_2d kernel (#14872) | 5 months ago |
| Jeff Bolz | f1a4e72de5 | vulkan: skip empty set_rows to avoid invalid API usage (#14860) | 5 months ago |
| Gabriel Larson | 4762ad7316 | model : make rope_yarn_log_mul optional for deepseek2 (#14896) | 5 months ago |
| Shunta Saito | 1dc9614e06 | llama : fix kq_scale for the attention layers of PLaMo2 (#14892) | 5 months ago |
| Aman Gupta | 446595b9b3 | Docs: add instructions for adding backends (#14889) | 5 months ago |
| deepsek | 66906cd82a | HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (#14624) | 5 months ago |
| hipudding | 11dd5a44eb | CANN: Implement GLU ops (#14884) | 5 months ago |
| R0CKSTAR | 9b8f3c6c77 | musa: fix build warnings (unused variable) (#14869) | 5 months ago |
| Aaron Teo | c7f3169cd5 | ggml-cpu : disable GGML_NNPA by default due to instability (#14880) | 6 months ago |