Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Pascal | 683fa6ba4e | fix: added a normalization step for MathJax-style \[\] and \(\) delimiters (#16599) | 3 months ago |
| GittyBurstein | b22572e97d | sycl : add ARANGE operator (#16362) | 3 months ago |
| Chenguang Li | 7a50cf388a | CANN: format code using .clang-format (#15863) | 3 months ago |
| takasurazeem | 6f5d924637 | common : Update the docs on -t --threads (#16236) | 3 months ago |
| takuya kodama | adc9b60f19 | ggml-cpu: replace putenv with setenv for const-correctness (#16573) | 3 months ago |
| yael-works | ee50ee1ead | SYCL: Add GGML_OP_MEAN operator support (#16009) | 3 months ago |
| Aleksei Nikiforov | 7adc79c032 | gguf-py : add support for endian conversion of BF16 data (#16594) | 3 months ago |
| safranowith | 466c1911ab | cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators (#16083) | 3 months ago |
| lhez | 0cb7a0683b | opencl: add q8_0 mm support (#16469) | 3 months ago |
| lhez | d93f8439b0 | opencl: fix FA for f32 (#16584) | 3 months ago |
| Aleksander Grygier | f9fb33f263 | Add server-driven parameter defaults and syncing (#16515) | 3 months ago |
| Sam/Samuel | f4ce81c45e | metal: optimise `GGML_OP_SUM` (#16559) | 3 months ago |
| Georgi Gerganov | 17304cbcc1 | server : fix img token logs (#16595) | 3 months ago |
| Xuan-Son Nguyen | 3e3cb19f64 | llama-quant: add support for mmproj (#16592) | 3 months ago |
| Julius Tischbein | 5acd455460 | CUDA: Changing the CUDA scheduling strategy to spin (#16585) | 3 months ago |
| Georgi Gerganov | 554fd578a5 | server : fix mtmd checkpoints (#16591) | 3 months ago |
| Georgi Gerganov | fa882fd2b1 | metal : avoid using Metal's gpuAddress property (#16576) | 3 months ago |
| SavicStefan | ffa059034c | vulkan: Add ACC_TYPE_VEC2 implementation (#16203) | 3 months ago |
| Aman Gupta | 120bf7046d | CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion (#16577) | 3 months ago |
| Jeff Bolz | 4258e0cfe7 | vulkan: Support FA with K/V in F32 (#16543) | 3 months ago |
| Jeff Bolz | 7ea15bb64c | vulkan: Improve build time for MSVC (#16545) | 3 months ago |
| Johannes Gäßler | 9c7185dd28 | CUDA: enable FA for FP32 KV cache (#16546) | 3 months ago |
| Aman Gupta | 1ee9d0b415 | CUDA: use fastdiv + ggml_cuda_mad for mmvf (#16557) | 3 months ago |
| Aman Gupta | 48e2fa9fb7 | CUDA: add fp kernel for larger batch size MoE (#16512) | 3 months ago |
| Anav Prasad | 5b6913c47b | cuda : remove legacy copy-op pointer indirection code (#16485) | 3 months ago |
| Georgi Gerganov | bc07349a7f | server : dynamic token limit for prompt cache (#16560) | 3 months ago |
| Georgi Gerganov | e60f241eac | metal : FA support F32 K and V and head size = 32 (#16531) | 3 months ago |
| Georgi Gerganov | e38b7c6e9e | graph : support cacheless embeddings with FA and iSWA (#16528) | 3 months ago |
| lhez | 5016b72862 | opencl: fix build targeting CL 2 (#16554) | 3 months ago |
| Johannes Gäßler | 7049736b2d | CUDA: fix numerical issues in tile FA kernel (#16540) | 3 months ago |