Reese Levine
|
9515c6131a
ggml: WebGPU disable SET_ROWS for now (#15078)
|
5 months ago |
Georgi Gerganov
|
fd1234cb46
llama : add gpt-oss (#15091)
|
5 months ago |
Sigbjørn Skjæret
|
f324a3b715
chat : only remove double bos/eos if added (#15086)
|
5 months ago |
Georgi Gerganov
|
be42642581
readme : update hot topics (#15097)
|
5 months ago |
Romain Biessy
|
3306ceabf0
sycl: fix mul_mat selection (#15092)
|
5 months ago |
Juk Armstrong
|
c81de6e107
Fix `glm4moe` bug (#15088)
|
5 months ago |
Alex Wu
|
22f060c9c4
webui: fix markdown table (#15081)
|
5 months ago |
compilade
|
ee3a9fcf88
context : fix index overflow on huge outputs (#15080)
|
5 months ago |
Diego Devesa
|
ec428b02c3
llama : add --n-cpu-moe option (#15077)
|
5 months ago |
compilade
|
19f68fa5a4
imatrix : warn when GGUF imatrix is saved without .gguf suffix (#15076)
|
5 months ago |
Christian Kastner
|
41613437ff
cmake: Add GGML_BACKEND_DIR option (#15074)
|
5 months ago |
Sigbjørn Skjæret
|
e5bebe5251
gguf-py : add --chat-template-file to gguf_new_metadata (#15075)
|
5 months ago |
Sam
|
ef0144c087
model: support GLM 4.5 family of models (#14939)
|
5 months ago |
Sigbjørn Skjæret
|
2721257e3e
quantize : fix confusing error message if ftype is invalid (#15071)
|
5 months ago |
Reese Levine
|
587d0118f5
ggml: WebGPU backend host improvements and style fixing (#14978)
|
5 months ago |
Jeff Bolz
|
5aa1105da2
vulkan: fix build when using glslang that does not support coopmat2 (#15062)
|
5 months ago |
compilade
|
d31192b4ee
imatrix : use GGUF by default (#14842)
|
5 months ago |
compilade
|
0a2f5496be
imatrix : fix 3d activation handling for hybrid and recurrent models (#14994)
|
5 months ago |
compilade
|
11a3811164
memory : handle kv_unified for hybrid models (#15050)
|
5 months ago |
Csaba Kecskemeti
|
97366dc6ab
vocab : JetBrains Mellum pre-tokenizer (#15045)
|
5 months ago |
Gabriel Larson
|
83bc2f288c
model : add text-only support for Kimi-VL (and find special tokens in text_config) (#15051)
|
5 months ago |
Jeff Bolz
|
6c7a441161
vulkan: Use coopmat2 for conv2d (#14982)
|
5 months ago |
lhez
|
5c0eb5ef54
opencl: fix adreno compiler detection logic (#15029)
|
5 months ago |
Johannes Gäßler
|
03d4698218
CUDA: use mma FA kernel for gqa > 4 on RTX 4000 (#15035)
|
5 months ago |
leejet
|
3303c19b16
cuda: make im2col a little faster (#15025)
|
5 months ago |
Daniel Bevenius
|
4fdea540bd
kv-cache : skip alignment of n_stream in kv-cache log msg [no ci] (#15040)
|
5 months ago |
Georgi Gerganov
|
a4569c41fd
llama : enable LLAMA_SET_ROWS=1 by default (#14959)
|
5 months ago |
Georgi Gerganov
|
15e92fd337
cuda, sycl : fix batched gemm when ne02 == 1 && ne03 > 1 (#15038)
|
5 months ago |
Sigbjørn Skjæret
|
2bf3fbf0b5
ci : check that pre-tokenizer hashes are up-to-date (#15032)
|
5 months ago |
Douglas Hanley
|
711d5e6fe6
convert : fix Qwen3-Embedding pre-tokenizer hash (#15030)
|
5 months ago |