Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Georgi Gerganov | a4a0aa5ea2 | ggml : fix dependencies for ggml_set_rows (#16318) | 3 months ago |
| Jeff Bolz | 92cd103f62 | vulkan: Fix validation failure in quantized flash attention (#16292) | 3 months ago |
| Sigbjørn Skjæret | b887d2f341 | ggml : fix GGML_F32_VEC_FMA argument order in ggml_vec_mad1_f32 (#16307) | 3 months ago |
| crat0z | bd0af02fc9 | common : fix reasoning before forced tool call via tool_choice = required (#16264) | 3 months ago |
| R0CKSTAR | d9e0e7c819 | ci : fix musa docker build (#16306) | 3 months ago |
| Aaron Teo | 0124ac989f | devops: switch to using ubuntu-22.04-s390x image (#16302) | 3 months ago |
| Imad Saddik | 2811c65286 | Fixed a few typos in the README of the LLaMA.cpp HTTP Server [no ci] (#16297) | 3 months ago |
| Jeff Bolz | d8359f5fde | vulkan: 64-bit im2col (#16135) | 3 months ago |
| Georgi Gerganov | 6a2c6145a0 | metal : extend mat-mat multiplication support (#16225) | 3 months ago |
| Georgi Gerganov | 3b53634fe3 | metal : fuse non-sequential nodes (#16102) | 3 months ago |
| Jeff Bolz | 1384abf8b8 | vulkan: handle mat_mul with A matrix > 4GB (#16176) | 3 months ago |
| Jeff Bolz | e6d65fb02d | vulkan: support arbitrary KV dimension in flash attention (#16160) | 3 months ago |
| Acly | 8656f5de68 | vulkan : make the vulkan.hpp dynamic dispatcher instance private (#16224) | 3 months ago |
| Aleksander Grygier | 4807e8f96a | Show message actions by default (#16289) | 3 months ago |
| Aman Gupta | c0bfc57af4 | CUDA: mul_mat_id for mmf for bs <= 64 for f16 and bs <= 32 for f32 (#16277) | 3 months ago |
| Johannes Gäßler | 75a3a6c2cd | CUDA: refactor and deduplicate vector FA kernels (#16208) | 3 months ago |
| Dmytro Minochkin | 0499b29c6f | vulkan: throw system error instead of SIGABRT during init on older devices (#16156) | 3 months ago |
| Adrien Gallouët | 234e2ff8ed | server : remove old LLAMA_SERVER_SSL (#16290) | 3 months ago |
| Jeff Bolz | 3f81b4e91c | vulkan: support GET_ROWS for k-quants (#16235) | 3 months ago |
| Adrien Gallouët | ace6a54565 | build : add LLAMA_OPENSSL option (#16287) | 3 months ago |
| Vinkal | 72b24d96c6 | model : make minicpm embedding_scale, residual_scale and logit_scale optional with legacy defaults (#16273) | 3 months ago |
| Aaron Teo | 624207e676 | devops: add s390x & ppc64le CI (#15925) | 3 months ago |
| Aleksander Grygier | 807e8c6d31 | Enhance text file detection logic for file attachments (#16199) | 3 months ago |
| Aleksander Grygier | 1a18927894 | Allow viewing conversations even when llama server is down (#16255) | 3 months ago |
| Isaac McFadyen | e0539eb6ae | webui: switch to hash-based routing (alternative of #16079) (#16157) | 3 months ago |
| Aleksander Grygier | 5d0a40f390 | Always show message actions for mobile UI + improvements for user message sizing (#16076) | 3 months ago |
| Radoslav Gerganov | d12a983659 | codeowners : add rgerganov as owner of RPC [no ci] (#16279) | 3 months ago |
| Aleksei Nikiforov | cc1cfa277b | mtmd : fix uninitialized variable in bicubic_resize (#16275) | 3 months ago |
| Georgi Gerganov | 54dbc37053 | metal : report OOM errors (#16274) | 3 months ago |
| Adrien Gallouët | b995a10760 | common : use cpp-httplib as a cURL alternative for downloads (#16185) | 3 months ago |