Commit History

Author SHA1 Message Date
  Georgi Gerganov 5783ae4359 metal : batch rows copy in a single threadgroup (#14384) 7 months ago
  Aaron Teo bf5bcd0b85 docs: update s390x documentation + add faq (#14389) 7 months ago
  R0CKSTAR 716301d1b0 musa: enable fp16 mma (all) and cublas on qy2 (#13842) 7 months ago
  Aaron Teo 60ef23d6c1 ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317) 7 months ago
  Sigbjørn Skjæret b193d53069 ggml : do not output unprintable characters on GGUF load failure (#14381) 7 months ago
  Anton Mitkov 2bf9d539dd sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (#13973) 7 months ago
  lhez 73e53dc834 opencl: ref count `ggml_backend_opencl_context` and refactor profiling (#14254) 7 months ago
  Georgi Gerganov 62af464227 batch : fix check for empty sequences in memory (#14364) 7 months ago
  Mathieu Baudier c148cf1946 cmake : use LLAMA_BUILD_NUMBER when defining LLAMA_INSTALL_VERSION (#14362) 7 months ago
  Nigel Bosch 1b809cee22 server : move no API key doc to /health (#14352) 7 months ago
  Sigbjørn Skjæret abf241045d main : honor --verbose-prompt on interactive prompts (#14350) 7 months ago
  Bartowski 901e20bbe5 jinja : Add Mistral-Small-3.2-24B-Instruct-2506.jinja (#14349) 7 months ago
  uvos 0142961a2e CUDA/HIP: optimize mmv paths taken for HIP devices (#14324) 7 months ago
  bandoti ce82bd0117 ci: add workflow for relocatable cmake package (#14346) 7 months ago
  Jeff Bolz bf2a99e3cb vulkan: update windows SDK in release.yml (#14344) 7 months ago
  Molly Sophia 72c6bc3f3d llama : better rwkv chat template and add missing `inputs.use_jinja` setting (#14336) 7 months ago
  Johannes Gäßler defe2158dd CUDA: mul_mat_v support for batch sizes > 1 (#14262) 7 months ago
  Georgi Gerganov 7b50d589a8 kv-cells : fix tracking of seq_pos (#14339) 7 months ago
  Jeff Bolz 3a9457df96 vulkan: update windows SDK in CI (#14334) 7 months ago
  Ed Addario fa4a9f2a1c quantize : handle user-defined pruning of whole layers (blocks) (#13037) 7 months ago
  Sigbjørn Skjæret 238005c2dc gguf-py : fix SpecialVocab parsing when post_processor is null (#14330) 7 months ago
  Ruikai Peng 66aba7aca9 run : avoid double tokenization (#14327) 7 months ago
  Georgi Gerganov f1f5e82df6 examples : fix is_first logic for tokenization (#14329) 7 months ago
  uvos af3373f1ad HIP: enable vec fattn on RDNA4 (#14323) 7 months ago
  yuiseki 5d5c066de8 mtmd : fix Pixtral OOM with large images by capping image_size to 1024 (#14326) 7 months ago
  Sigbjørn Skjæret 40bfa04c95 common : use std::string_view now that we target c++17 (#14319) 7 months ago
  Aman Gupta aa064b2eb7 CUDA: add mean operation (#14313) 7 months ago
  Sigbjørn Skjæret aa0ef5c578 gguf-py : fix Qwen3-Embedding eos token (#14314) 7 months ago
  Markus Tavenrath bb16041cae Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (#13792) 7 months ago
  Sigbjørn Skjæret 58cba76a9a gguf-py : fix TemplateProcessing pair when bos/eos is missing (#14312) 7 months ago