Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Daniel Bevenius | ffba4f29e6 | examples : add debug utility/example (#18464) | 3 weeks ago |
| hipudding | 3333951d86 | CANN: Fix rename for get_env (#18652) | 3 weeks ago |
| Raul Torres | 193ee38a1b | CANN: Rename `get_env` to `get_env_as_lowercase` (#18624) | 3 weeks ago |
| Max Krasnyansky | 95ea9e0861 | Hexagon add support for f16/f32 flash attention, scale, set-rows and improve f16/32 matmul (#18611) | 3 weeks ago |
| Tarek Dakhran | ccbc84a537 | mtmd: mtmd_audio_streaming_istft (#18645) | 3 weeks ago |
| Johannes Gäßler | 68b4d516c3 | llama-params-fit: fix last devices with low VRAM (#18494) | 3 weeks ago |
| Aadeshveer Singh | 24af22fc36 | ggml : optimize cuda ssm_scan using warp-level reduction (#18505) | 3 weeks ago |
| Xuan-Son Nguyen | 07fbe19f1f | arg: use CSV escape style for multiple-value args (#18643) | 3 weeks ago |
| Jeff Bolz | ea13cba850 | vulkan: support buffer_from_host_ptr (#18467) | 3 weeks ago |
| Aman Gupta | 090b137e56 | ggml-cuda: refactor cuda graph usage (#18637) | 3 weeks ago |
| Beinsezii | 968929528c | mmq.cu: tune mmq/rocblas switching for RDNA (#18537) | 3 weeks ago |
| R | 3d26a09dc7 | server : add thinking content blocks to Anthropic Messages API (#18551) | 3 weeks ago |
| Christian Kastner | bd2a93d475 | gguf-py : add requests to dependencies (#18629) | 3 weeks ago |
| Adrien Gallouët | e75ee11024 | ggml : fix avx512bf16 build (#18623) | 3 weeks ago |
| Raul Torres | da9b8d3300 | CANN: Make `valid_values` variable `static const` (#18627) | 3 weeks ago |
| nwyin | e443fbcfa5 | ggml webgpu: add CEIL operation support (#18605) | 3 weeks ago |
| Tarek Dakhran | 73d284a250 | model : add LFM2-ColBert-350M (#18607) | 3 weeks ago |
| Johannes Gäßler | df17a4c94f | CUDA: fix FA FP16 accumulator overflow for Granite (#18614) | 3 weeks ago |
| tt | 1871f0ba56 | add YoutuVLForConditionalGeneration architectures (#18620) | 3 weeks ago |
| Aman Gupta | f47edb8c19 | ggml-cuda: check for srcs outside the cgraph (#18583) | 3 weeks ago |
| Vladislav Sayapin | da143b9940 | server : fix router child env in containerized environments (#18562) | 3 weeks ago |
| Jeff Bolz | f1768d8f03 | vulkan: fix topk_moe_sigmoid_norm_bias failures in GLM-4.6 (#18582) | 3 weeks ago |
| Georgi Gerganov | 2da64a2f8a | models : fix backend assignment for Granite/Nemotron graphs (#18599) | 3 weeks ago |
| Jeff Bolz | b37124d2d2 | vulkan: handle quantize_q8_1 overflowing the max workgroup count (#18515) | 3 weeks ago |
| Sigbjørn Skjæret | eadc4184ca | llama : refactor rope_freq_base/scale_swa conversion and init (#18553) | 3 weeks ago |
| Chenguang Li | 67e3f6f601 | CANN: add operator fusion support for ADD + RMS_NORM (#17512) | 3 weeks ago |
| Francisco Herrera | 92ac1e016b | doc: clarify that steps also apply to linux for opencl (#18002) | 3 weeks ago |
| Ali Tariq | 8e3a761189 | ci : init git lfs in every build for RISC-V (#18590) | 3 weeks ago |
| Daniel Bevenius | d3dce4e0a5 | sampling : add support for backend sampling (#17004) | 3 weeks ago |
| Tarek Dakhran | 4974bf53cf | model : mtmd : make input norm optional in LFM2-VL (#18594) | 3 weeks ago |