Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Sigbjørn Skjæret | b2d980fce0 | codeowners : claim responsibility for ci, models, gguf-py and convert (#16124) | 4 months ago |
| Georgi Gerganov | 5c6106a696 | contrib : update roles (#16113) | 4 months ago |
| Georgi Gerganov | ec65fb52f0 | ci : remove vulkaninfo calls (#16169) | 4 months ago |
| Georgi Gerganov | 1d660d2fae | ci : use smaller model (#16168) | 4 months ago |
| Jeff Bolz | a20d810d79 | vulkan: add RTE variants of exp shader (#16165) | 4 months ago |
| Georgi Gerganov | 4d0a7cbc61 | ci : adjust params for less runtime (#16167) | 4 months ago |
| Ruben Ortlam | 9073a73d82 | vulkan: vec dot matrix multiplication fix (#16151) | 4 months ago |
| lhez | 51f5a45fbe | opencl: fix concat crash on win arm64 with Adreno (#15944) | 4 months ago |
| lhez | c4510dc937 | opencl: initial `q8_0` mv support (#15732) | 4 months ago |
| Georgi Gerganov | da30ab5f86 | ci : add label for the RISC-V runner (#16150) | 4 months ago |
| Georgi Gerganov | 28baac9c9f | ci : migrate ggml ci to self-hosted runners (#16116) | 4 months ago |
| Giuseppe Scrivano | 1eeb523c3e | vulkan: optimize UMA buffer operations and fix driver hangs (#16059) | 4 months ago |
| Jeff Bolz | 5bb4a3edec | vulkan: fix validation error about VK_PIPELINE_CREATE_CAPTURE_STATISTICS_BIT_KHR (#16086) | 4 months ago |
| Georgi Gerganov | 7f766929ca | sync : ggml | 4 months ago |
| Daniel Bevenius | 405921dcef | ggml : introduce semantic versioning (ggml/1336) | 4 months ago |
| Gregor Jasny | fa6383ca7e | CUDA : conditionally add cuda architectures (ggml/1341) | 4 months ago |
| Ruben Ortlam | 803dac2e48 | vulkan: use vec dot for matrix matrix multiplications (#16056) | 4 months ago |
| Benni | 459c0c2c1a | server: fix SSE and OpenAI compatibility for error messages when streaming (#16109) | 4 months ago |
| ssweens | be79d9fdd9 | llama-bench: add --devices and --list-devices support (#16039) | 4 months ago |
| shun095 | f432d8d83e | chat: Fix streaming parser for granite models (#15682) | 4 months ago |
| Aleksander Grygier | 4067f07fc5 | feat: Improve mobile UI for Settings Dialog (#16084) | 4 months ago |
| Xuan-Son Nguyen | 4b8560ab56 | chat : fix build on arm64 (#16101) | 4 months ago |
| Xuan-Son Nguyen | 0dd58b6877 | ggml : refactor forward_dup for cpu backend (#16062) | 4 months ago |
| Adrien Gallouët | 69ffd89163 | ggml-amx : fix ggml_amx_init() on generic Linux (#16049) | 4 months ago |
| Adrien Gallouët | 246c0d9c79 | cmake : fix static linking for OpenMP on Unix-like systems (#16031) | 4 months ago |
| Shawn Gu | 3edd87cd05 | opencl: optimize mxfp4 kernels (#16037) | 4 months ago |
| Jeff Bolz | c0b45097c3 | rename optimize_graph to graph_optimize (#16082) | 4 months ago |
| Bowen Han | 38dbdf4c05 | CUDA: Optimize PAD_REFLECT_1D (#15957) | 4 months ago |
| Johannes Gäßler | 368560a1e3 | CUDA: fix compilation on CC 6.0 (#16091) | 4 months ago |
| Eric Curtin | 4ca088b036 | Add resumable downloads for llama-server model loading (#15963) | 4 months ago |