Henry Linjamäki
|
a4e8912dfd
opencl: Add support for multiple devices (#12622)
|
8 months ago |
Henry Linjamäki
|
edbf42edfd
opencl: fix couple crashes (#12795)
|
8 months ago |
Diego Devesa
|
d643bb2c79
releases : build CPU backend separately (windows) (#13642)
|
8 months ago |
Georgi Gerganov
|
8e186ef0e7
hparams : support models for which all layers use SWA (#13682)
|
8 months ago |
Georgi Gerganov
|
5fbfe384d4
server : improve error reporting (#13680)
|
8 months ago |
antichristHater
|
c76532e7ba
convert : add qwen2vl support for unsloth merges (#13686)
|
8 months ago |
Sigbjørn Skjæret
|
2aa777d86d
examples : switch retrieval to llama_encode (#13685)
|
8 months ago |
Emmanuel Ferdman
|
eb0f5c28d3
gguf-py : display the invalid gguf type (#13687)
|
8 months ago |
Xuan-Son Nguyen
|
cf4cb59e64
ggml : add ggml_gelu_erf() (#13667)
|
8 months ago |
Robin Davidsson
|
0d5c742161
server : Add the endpoints /api/tags and /api/chat (#13659)
|
8 months ago |
Dorin-Andrei Geman
|
42158ae2e8
server : fix first message identification (#13634)
|
8 months ago |
Georgi Gerganov
|
797f2ac062
kv-cache : simplify the interface (#13660)
|
8 months ago |
Georgi Gerganov
|
b44890df2e
model : disable SWA for Phi models (#13676)
|
8 months ago |
R0CKSTAR
|
33983057d0
musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647)
|
8 months ago |
Eve
|
fb1cab201c
vulkan: fix warnings (#13626)
|
8 months ago |
l3utterfly
|
b7a17463ec
mtmd-helper : bug fix to token batching in mtmd (#13650)
|
8 months ago |
Georgi Gerganov
|
be0239693c
model : fix llama4 graph (#13663)
|
8 months ago |
Georgi Gerganov
|
a4090d1174
llama : remove llama_kv_cache_view API + remove deprecated (#13653)
|
8 months ago |
Johannes Gäßler
|
b69f1647f9
CUDA: skip fully masked-out KV in FA vec kernel (#13584)
|
8 months ago |
Sigbjørn Skjæret
|
759e37b0d8
tests : avoid github urls due to throttling (#13654)
|
8 months ago |
Svetlozar Georgiev
|
4245e622e0
sycl: disable reorder for sycl mulmat (#13536)
|
8 months ago |
0cc4m
|
c9c64dee57
Set GLM4 blk.*.attn_output.weight, kqv_out-* matmul to GGML_PREC_F32 to fix infinity values in output (#13639)
|
8 months ago |
Georgi Gerganov
|
c00a2634be
metal : fix typo in FA kernel comments (#13651)
|
8 months ago |
Georgi Gerganov
|
e298d2fbd0
kv-cache : add SWA support (#13194)
|
8 months ago |
Xinpeng Dou
|
f0adb80bf7
CANN: Update CANN model support (#13162)
|
8 months ago |
Nicolò Scipione
|
f7c9429c85
sycl : Overcoming workaround for mmap() allocation on Windows (#13482)
|
8 months ago |
psocolovsky
|
1dfbf2cf3a
common : add load_progress_callback (#13617)
|
8 months ago |
0cc4m
|
8960efd0a6
Vulkan: Add f32 accumulator support to quantized mul mat to fix GLM4 32B incoherence (#13607)
|
8 months ago |
Alberto Cabrera Pérez
|
725f23f1f3
sycl : backend documentation review (#13544)
|
8 months ago |
Xuan-Son Nguyen
|
92ecdcc06a
mtmd : add vision support for llama 4 (#13282)
|
8 months ago |