Daniel Bevenius | ffba4f29e6 examples : add debug utility/example (#18464) | 3 weeks ago |
hipudding | 3333951d86 CANN: Fix rename for get_env (#18652) | 3 weeks ago |
Raul Torres | 193ee38a1b CANN: Rename `get_env` to `get_env_as_lowercase` (#18624) | 3 weeks ago |
Max Krasnyansky | 95ea9e0861 Hexagon add support for f16/f32 flash attention, scale, set-rows and improve f16/32 matmul (#18611) | 3 weeks ago |
Tarek Dakhran | ccbc84a537 mtmd: mtmd_audio_streaming_istft (#18645) | 3 weeks ago |
Johannes Gäßler | 68b4d516c3 llama-params-fit: fix last devices with low VRAM (#18494) | 3 weeks ago |
Aadeshveer Singh | 24af22fc36 ggml : optimize cuda ssm_scan using warp-level reduction (#18505) | 3 weeks ago |
Xuan-Son Nguyen | 07fbe19f1f arg: use CSV escape style for multiple-value args (#18643) | 3 weeks ago |
Jeff Bolz | ea13cba850 vulkan: support buffer_from_host_ptr (#18467) | 3 weeks ago |
Aman Gupta | 090b137e56 ggml-cuda: refactor cuda graph usage (#18637) | 3 weeks ago |
Beinsezii | 968929528c mmq.cu: tune mmq/rocblas switching for RDNA (#18537) | 3 weeks ago |
R | 3d26a09dc7 server : add thinking content blocks to Anthropic Messages API (#18551) | 3 weeks ago |
Christian Kastner | bd2a93d475 gguf-py : add requests to dependencies (#18629) | 3 weeks ago |
Adrien Gallouët | e75ee11024 ggml : fix avx512bf16 build (#18623) | 3 weeks ago |
Raul Torres | da9b8d3300 CANN: Make `valid_values` variable `static const` (#18627) | 3 weeks ago |
nwyin | e443fbcfa5 ggml webgpu: add CEIL operation support (#18605) | 3 weeks ago |
Tarek Dakhran | 73d284a250 model : add LFM2-ColBert-350M (#18607) | 3 weeks ago |
Johannes Gäßler | df17a4c94f CUDA: fix FA FP16 accumulator overflow for Granite (#18614) | 3 weeks ago |
tt | 1871f0ba56 add YoutuVLForConditionalGeneration architectures (#18620) | 3 weeks ago |
Aman Gupta | f47edb8c19 ggml-cuda: check for srcs outside the cgraph (#18583) | 3 weeks ago |
Vladislav Sayapin | da143b9940 server : fix router child env in containerized environments (#18562) | 3 weeks ago |
Jeff Bolz | f1768d8f03 vulkan: fix topk_moe_sigmoid_norm_bias failures in GLM-4.6 (#18582) | 3 weeks ago |
Georgi Gerganov | 2da64a2f8a models : fix backend assignment for Granite/Nemotron graphs (#18599) | 3 weeks ago |
Jeff Bolz | b37124d2d2 vulkan: handle quantize_q8_1 overflowing the max workgroup count (#18515) | 3 weeks ago |
Sigbjørn Skjæret | eadc4184ca llama : refactor rope_freq_base/scale_swa conversion and init (#18553) | 3 weeks ago |
Chenguang Li | 67e3f6f601 CANN: add operator fusion support for ADD + RMS_NORM (#17512) | 3 weeks ago |
Francisco Herrera | 92ac1e016b doc: clarify that steps also apply to linux for opencl (#18002) | 3 weeks ago |
Ali Tariq | 8e3a761189 ci : init git lfs in every build for RISC-V (#18590) | 3 weeks ago |
Daniel Bevenius | d3dce4e0a5 sampling : add support for backend sampling (#17004) | 3 weeks ago |
Tarek Dakhran | 4974bf53cf model : mtmd : make input norm optional in LFM2-VL (#18594) | 3 weeks ago |