Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Sigbjørn Skjæret | b25346221d | llama : return mistral-v7-tekken as default template only (#14390) | 6 months ago |
| Georgi Gerganov | e8215dbb96 | metal : add special-case mat-vec mul for ne00 == 4 (#14385) | 6 months ago |
| Georgi Gerganov | 5783ae4359 | metal : batch rows copy in a single threadgroup (#14384) | 6 months ago |
| Aaron Teo | bf5bcd0b85 | docs: update s390x documentation + add faq (#14389) | 6 months ago |
| R0CKSTAR | 716301d1b0 | musa: enable fp16 mma (all) and cublas on qy2 (#13842) | 6 months ago |
| Aaron Teo | 60ef23d6c1 | ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317) | 6 months ago |
| Sigbjørn Skjæret | b193d53069 | ggml : do not output unprintable characters on GGUF load failure (#14381) | 6 months ago |
| Anton Mitkov | 2bf9d539dd | sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (#13973) | 6 months ago |
| lhez | 73e53dc834 | opencl: ref count `ggml_backend_opencl_context` and refactor profiling (#14254) | 7 months ago |
| Georgi Gerganov | 62af464227 | batch : fix check for empty sequences in memory (#14364) | 7 months ago |
| Mathieu Baudier | c148cf1946 | cmake : use LLAMA_BUILD_NUMBER when defining LLAMA_INSTALL_VERSION (#14362) | 7 months ago |
| Nigel Bosch | 1b809cee22 | server : move no API key doc to /health (#14352) | 7 months ago |
| Sigbjørn Skjæret | abf241045d | main : honor --verbose-prompt on interactive prompts (#14350) | 7 months ago |
| Bartowski | 901e20bbe5 | jinja : Add Mistral-Small-3.2-24B-Instruct-2506.jinja (#14349) | 7 months ago |
| uvos | 0142961a2e | CUDA/HIP: optimize mmv paths taken for HIP devices (#14324) | 7 months ago |
| bandoti | ce82bd0117 | ci: add workflow for relocatable cmake package (#14346) | 7 months ago |
| Jeff Bolz | bf2a99e3cb | vulkan: update windows SDK in release.yml (#14344) | 7 months ago |
| Molly Sophia | 72c6bc3f3d | llama : better rwkv chat template and add missing `inputs.use_jinja` setting (#14336) | 7 months ago |
| Johannes Gäßler | defe2158dd | CUDA: mul_mat_v support for batch sizes > 1 (#14262) | 7 months ago |
| Georgi Gerganov | 7b50d589a8 | kv-cells : fix tracking of seq_pos (#14339) | 7 months ago |
| Jeff Bolz | 3a9457df96 | vulkan: update windows SDK in CI (#14334) | 7 months ago |
| Ed Addario | fa4a9f2a1c | quantize : handle user-defined pruning of whole layers (blocks) (#13037) | 7 months ago |
| Sigbjørn Skjæret | 238005c2dc | gguf-py : fix SpecialVocab parsing when post_processor is null (#14330) | 7 months ago |
| Ruikai Peng | 66aba7aca9 | run : avoid double tokenization (#14327) | 7 months ago |
| Georgi Gerganov | f1f5e82df6 | examples : fix is_first logic for tokenization (#14329) | 7 months ago |
| uvos | af3373f1ad | HIP: enable vec fattn on RDNA4 (#14323) | 7 months ago |
| yuiseki | 5d5c066de8 | mtmd : fix Pixtral OOM with large images by capping image_size to 1024 (#14326) | 7 months ago |
| Sigbjørn Skjæret | 40bfa04c95 | common : use std::string_view now that we target c++17 (#14319) | 7 months ago |
| Aman Gupta | aa064b2eb7 | CUDA: add mean operation (#14313) | 7 months ago |
| Sigbjørn Skjæret | aa0ef5c578 | gguf-py : fix Qwen3-Embedding eos token (#14314) | 7 months ago |