Jeff Bolz
|
c10ed6cbcc
vulkan: Disable coopmat/coopmat2/bfloat extensions if glslc doesn't support it (#13696)
|
8 months ago |
Judd
|
a127ff1780
use LOG_WARN to replace `std::cerr` (#13657)
|
8 months ago |
Diego Devesa
|
3079e9ac8e
release : fix windows hip release (#13707)
|
8 months ago |
Georgi Gerganov
|
8a1d206f1d
tts : fix n_ubatch + make WavTokenizer cache-less (#13713)
|
8 months ago |
Xuan-Son Nguyen
|
797990c4bc
mtmd : add ultravox audio input (#13623)
|
8 months ago |
Aaron Teo
|
ab86335760
common: Include torch package for s390x (#13699)
|
8 months ago |
Georgi Gerganov
|
cc74d5be99
server : pad small embedding batches (#13692)
|
8 months ago |
Sigbjørn Skjæret
|
5be24af73d
gguf-py : correct charsmap parameter typing (#13701)
|
8 months ago |
Nicolò Scipione
|
d394a9aedc
sycl : Remove waits from function calls (#13702)
|
8 months ago |
Ewan Crawford
|
6b56a64690
SYCL: Avoid using with SYCL-Graph for unsupported nodes (#13587)
|
8 months ago |
Henry Linjamäki
|
a4e8912dfd
opencl: Add support for multiple devices (#12622)
|
8 months ago |
Henry Linjamäki
|
edbf42edfd
opencl: fix couple crashes (#12795)
|
8 months ago |
Diego Devesa
|
d643bb2c79
releases : build CPU backend separately (windows) (#13642)
|
8 months ago |
Georgi Gerganov
|
8e186ef0e7
hparams : support models for which all layers use SWA (#13682)
|
8 months ago |
Georgi Gerganov
|
5fbfe384d4
server : improve error reporting (#13680)
|
8 months ago |
antichristHater
|
c76532e7ba
convert : add qwen2vl support for unsloth merges (#13686)
|
8 months ago |
Sigbjørn Skjæret
|
2aa777d86d
examples : switch retrieval to llama_encode (#13685)
|
8 months ago |
Emmanuel Ferdman
|
eb0f5c28d3
gguf-py : display the invalid gguf type (#13687)
|
8 months ago |
Xuan-Son Nguyen
|
cf4cb59e64
ggml : add ggml_gelu_erf() (#13667)
|
8 months ago |
Robin Davidsson
|
0d5c742161
server : Add the endpoints /api/tags and /api/chat (#13659)
|
8 months ago |
Dorin-Andrei Geman
|
42158ae2e8
server : fix first message identification (#13634)
|
8 months ago |
Georgi Gerganov
|
797f2ac062
kv-cache : simplify the interface (#13660)
|
8 months ago |
Georgi Gerganov
|
b44890df2e
model : disable SWA for Phi models (#13676)
|
8 months ago |
R0CKSTAR
|
33983057d0
musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (#13647)
|
8 months ago |
Eve
|
fb1cab201c
vulkan: fix warnings (#13626)
|
8 months ago |
l3utterfly
|
b7a17463ec
mtmd-helper : bug fix to token batching in mtmd (#13650)
|
8 months ago |
Georgi Gerganov
|
be0239693c
model : fix llama4 graph (#13663)
|
8 months ago |
Georgi Gerganov
|
a4090d1174
llama : remove llama_kv_cache_view API + remove deprecated (#13653)
|
8 months ago |
Johannes Gäßler
|
b69f1647f9
CUDA: skip fully masked-out KV in FA vec kernel (#13584)
|
8 months ago |
Sigbjørn Skjæret
|
759e37b0d8
tests : avoid github urls due to throttling (#13654)
|
8 months ago |