Dobri Danchev | 618575c582 | Fix broken build: require updated pip to support --break-system-packages (#15357) | 5 months ago
compilade | f44f793172 | ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (#15379) | 5 months ago
Jeff Bolz | ae532eac2c | vulkan: disable spirv-opt for bfloat16 shaders (#15352) | 5 months ago
Oleksandr Kuvshynov | e5155e6986 | server : export max observed n_past value (#15361) | 5 months ago
Jeff Bolz | 21c17b5bef | vulkan: Use larger workgroups for mul_mat_vec when M is small (#15355) | 5 months ago
Dong Won Kim | 19f4decae0 | vulkan: support sqrt (#15370) | 5 months ago
Sigbjørn Skjæret | 4d196981d4 | convert : force patch_embd weights to F16 or F32 to avoid broken GGUFs (#15367) | 5 months ago
Sigbjørn Skjæret | b143fbc87a | ci : fix hang in windows-hip build/release (#15365) | 5 months ago
Jeff Bolz | de5627910d | vulkan: Optimize argsort (#15354) | 5 months ago
Tarek Dakhran | 65349f26f2 | model : support vision LiquidAI LFM2-VL family (#15347) | 5 months ago
Jeff Bolz | 1fe00296f5 | vulkan: fuse adds (#15252) | 5 months ago
Jeff Bolz | de2192794f | vulkan: Support mul_mat_id with f32 accumulators (#15337) | 5 months ago
Jeff Bolz | 2e2b22ba66 | vulkan: Add missing bounds checking to scalar/coopmat1 mul_mat_id (#15334) | 5 months ago
rmatif | 912ff8c119 | OpenCL: add initial FA support (#14987) | 5 months ago
Daniel Bevenius | 5e6229a840 | common : fix double bos, use common_chat_templates for add_bos and add_eos (#15326) | 5 months ago
lhez | e2c1bfff53 | opencl: add initial mxfp4 support via mv (#15270) | 5 months ago
Georgi Gerganov | 5edf1592fd | vulkan : fix out-of-bounds access in argmax kernel (#15342) | 5 months ago
Georgi Gerganov | db3010bd23 | vulkan : fix compile warnings on macos (#15340) | 5 months ago
Aaron Teo | ff27f80a74 | ggml: initial IBM zDNN backend (#14975) | 5 months ago
Sigbjørn Skjæret | d3248d9b65 | ci : fix ios-xcode-build (#15324) | 5 months ago
Diego Devesa | 7aeee88cfe | ci : move ccache action to ggml-org fork (#15328) | 5 months ago
Johannes Gäßler | b07791aa1d | test-opt: fix backend support check (#15317) | 5 months ago
Johannes Gäßler | 4227c9be42 | CUDA: fix negative KV_max values in FA (#15321) | 5 months ago
Georgi Gerganov | df36bce667 | eval-callback : stop on first NaN (#15320) | 5 months ago
Diego Devesa | f75b830647 | chat : include kwargs in template example (#15309) | 5 months ago
Daniel Bevenius | 7a0de96045 | llama : add 18-layer model type for Gemma 3-270m (#15319) | 5 months ago
simevo | e4e915912c | devops : fix compile bug when the BASE_CUDA_DEV_CONTAINER is based on Ubuntu 24.04 (#15005) | 5 months ago
uvos | 5ba36f6103 | HIP: Cleanup hipification header (#15285) | 5 months ago
Aldehir Rojas | b204a5a234 | gpt-oss: implement harmony parsing (#15181) | 5 months ago
Christian Kastner | 646944cfa8 | docker : Enable GGML_CPU_ALL_VARIANTS for ARM (#15267) | 5 months ago