Commit History

Author SHA1 Message Date
  ccbinn 0440bfd160 metal : fix recommendedMaxWorkingSetSize availability on legacy iOS/macOS (#19088) 1 week ago
  Sigbjørn Skjæret 0bf5636938 convert : yield Gemma3N custom_map tensors directly (#19091) 1 week ago
  Aman Gupta bcb43163ae ggml-cpu: Use tiled FA for prompt-processing (#19012) 1 week ago
  Georgi Gerganov d9c6ce46f7 kv-cache : support V-less cache (#19067) 1 week ago
  Sigbjørn Skjæret 70d860824a convert : fix Gemma3N, GraniteMoe and Ernie4.5Moe (#19084) 1 week ago
  Georgi Gerganov 080b161995 completion : fix prompt cache for recurrent models (#19045) 1 week ago
  Molly Sophia 1243f93a2d readme: update RWKV7 model links (#19061) 1 week ago
  Jakkala Mahesh 24bc238303 llama: fix integer type consistency in split helpers (#18894) 1 week ago
  Daniel Bevenius 16639ba217 common : use two decimal places for float arg help messages (#19048) 1 week ago
  Bartowski 9981c30130 convert : fix conversion for inheriting models that were bypassing modify_tensors (#19064) 1 week ago
  Johannes Gäßler e9fd8dcab4 llama-fit-params: keep explicit --ctx-size 0 (#19070) 1 week ago
  Johannes Gäßler 4e5b83b226 GGUF: check that tensor size is representable (#19072) 1 week ago
  Xuan-Son Nguyen bb02f74c61 chat: fix language input for translategemma (#19052) 1 week ago
  Johannes Gäßler 8f91ca54ec CUDA: re-use MLA K data for V in MMA FA (#19057) 1 week ago
  Aman Gupta 81ab64f3c8 ggml-cuda: enable cuda-graphs for `n-cpu-moe` (#18934) 1 week ago
  nullname 8af1f5f430 ggml-hexagon: flash-attn opt (#19025) 1 week ago
  Georgi Gerganov 557515be1e graph : utilize `ggml_build_forward_select()` to avoid reallocations (#18898) 1 week ago
  Neo Zhang cb6caca191 [SYCL] use malloc to support both iGPU and dGPU in same time (#18992) 1 week ago
  Xuan-Son Nguyen b5b8fa1c8b chat : fix translategemma crash on common_chat_format_example (#19019) 1 week ago
  Daniel Bevenius a14b960bc7 model-conversion : use BUILD_DIR variable in all scripts (#19015) 1 week ago
  Alberto Cabrera Pérez 091a46cb8d ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm) (#18860) 1 week ago
  Aldehir Rojas a3e812811d cli : load parser definition (#19031) 1 week ago
  Xuan-Son Nguyen 51fa458a92 server : support preserving reasoning_content in assistant message (#18994) 1 week ago
  Georgi Gerganov a5eaa1d6a3 mla : make the V tensor a view of K (#18986) 1 week ago
  Johannes Gäßler e2baf02162 CUDA: fix alignment check for FA (#19023) 1 week ago
  Aman Gupta e34d6d03b2 convert_hf_to_gguf.py: refactor modify_tensors to call super (#18866) 1 week ago
  lhez 9c96465f99 opencl: enable the general fp mm for non-cont input and as a fallback for specialized kqv kernel for adreno (#18970) 1 week ago
  Xuan-Son Nguyen 4e595b250a server: do not log certain endpoints (avoid log spam) (#19028) 1 week ago
  Georgi Gerganov 0e4ebeb057 quant : manual overrides of tensor types take precedence (#18952) 1 week ago
  Aaron Teo 8b30840703 release: update github api (#19022) 1 week ago