Commit History

Author SHA1 Message Date
  Jeff Bolz a0f3897d53 vulkan: fix top_k bug when there are ties in the input (#17659) 1 month ago
  Acly e15cd06a94 vulkan : support conv-2d with large output size (#17685) 1 month ago
  Reese Levine fd57b24c0f ggml webgpu: unary op support, code refactoring, ops support (#17764) 1 month ago
  Jeff Bolz 6ab0d64960 vulkan: enable mmvq for q2_k on NVIDIA (#17675) 1 month ago
  Jeff Bolz 93bb92664e vulkan: set all memory allocations to high priority (#17624) 1 month ago
  Georgi Gerganov 8160b38a5f rpc : fix alloc size logic (#17116) 1 month ago
  Georgi Gerganov c41bde6fbd metal : add residency sets keep-alive heartbeat (#17766) 1 month ago
  Johannes Gäßler 6016d0bd41 HIP : fix RDNA4 build (#17792) 1 month ago
  Pascal 1be97831e4 fix: prevent segfault in tokenizer on highly repetitive input (#17786) 1 month ago
  Adrien Gallouët a6cfc212ed ci : fix winget workflow (#17790) 1 month ago
  shalinib-ibm 3a0d10533a Q4/Q8 Tiled Gemm Optimization. (#16999) 1 month ago
  Piotr Wilkin (ilintar) 6648989673 Add pwilkin to CODEOWNERS for chat files (#17789) 1 month ago
  Johannes Gäßler e95d0bc8fd CUDA: fix FA VKQ accumulator overflow (#17746) 1 month ago
  Jiacheng (Jason) Chen 668ed76574 HIP: enable WMMA-MMQ INT kernels for RDNA 3 (#17576) 1 month ago
  Sigbjørn Skjæret 03d9a77b85 ci : transform release binary root dir in tar to llama-bXXXX (#17773) 1 month ago
  Gabe Goodhart 3143a755c8 docs : update ops.md (Metal, BLAS) (#17768) 1 month ago
  Piotr Wilkin (ilintar) 96fe9badfc Add support for CUMSUM and TRI for CUDA. (#17584) 1 month ago
  Gabe Goodhart bde188d60f metal: TRI, FILL, EXPM1, SOFTPLUS (#16623) 1 month ago
  Xuan-Son Nguyen 9d0229967a server: strip content-length header on proxy (#17734) 1 month ago
  Xuan-Son Nguyen c4c10bfb86 server: move msg diffs tracking to HTTP thread (#17740) 1 month ago
  Daniel Bevenius 817d743cc1 examples : add missing code block end marker [no ci] (#17756) 1 month ago
  Daniel Bevenius bd4ef13476 common : skip model validation when --help is requested (#17755) 1 month ago
  Alberto Cabrera Pérez 87a2084c45 ggml-cpu : remove asserts always evaluating to false (#17728) 1 month ago
  SmartestWashingMachine 3659aa28e9 convert: use existing local chat_template if mistral-format model has one. (#17749) 1 month ago
  Adrien Gallouët 2a73f81f8a cmake : simplify build info detection using standard variables (#17423) 1 month ago
  Sigbjørn Skjæret 7dba049b07 ci : disable ggml-ci-x64-amd-* (#17753) 1 month ago
  Adrien Gallouët 83c1171529 common: use native MultiByteToWideChar (#17738) 1 month ago
  Georgi Gerganov 0d1324856f metal : use params per pipeline instance (#17739) 1 month ago
  Georgi Gerganov a67ef0f47f llama : fix sanity checks during quantization (#17721) 1 month ago
  Adrien Gallouët ef75a89fdb build : move _WIN32_WINNT definition to headers (#17736) 1 month ago