Xuan-Son Nguyen
|
053367d149
mtmd : support InternVL 2.5 and 3 (#13422)
|
8 months ago |
Johannes Gäßler
|
d8919424f1
CUDA: fix FlashAttention on Turing (#13415)
|
8 months ago |
Xuan-Son Nguyen
|
7fef11766c
arg : add env var to control mmproj (#13416)
|
8 months ago |
Jeff Bolz
|
dc1d2adfc0
vulkan: scalar flash attention implementation (#13324)
|
8 months ago |
Helton Reis
|
7c28a74e07
chore(llguidance): use tagged version that does not break the build (#13413)
|
8 months ago |
Xuan-Son Nguyen
|
33eff40240
server : vision support via libmtmd (#12898)
|
8 months ago |
Alberto Cabrera Pérez
|
17512a94d6
sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (#12858)
|
8 months ago |
Georgi Gerganov
|
611aa914ef
metal : optimize MoE for large batches (#13388)
|
8 months ago |
Johannes Gäßler
|
0cf6725e9f
CUDA: FA support for Deepseek (Ampere or newer) (#13306)
|
8 months ago |
Diego Devesa
|
27ebfcacba
llama : do not crash if there is no CPU backend (#13395)
|
8 months ago |
Johannes Gäßler
|
5c86c9ed3e
CUDA: fix crash on large batch size for MoE models (#13384)
|
8 months ago |
Bartowski
|
efb8b47eda
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
|
8 months ago |
R0CKSTAR
|
0527771dd8
llama-run: add support for downloading models from ModelScope (#13370)
|
8 months ago |
Xuan-Son Nguyen
|
2189fd3b63
mtmd : fix batch_view for m-rope (#13397)
|
8 months ago |
Xuan-Son Nguyen
|
3f96aeff39
llama : one-off chat template fix for Mistral-Small-2503 (#13398)
|
8 months ago |
Radoslav Gerganov
|
b486ba05bf
rpc : add rpc_msg_set_tensor_hash_req (#13353)
|
8 months ago |
Jeff Bolz
|
02115dcd9a
vulkan: Allow up to 4096 elements for mul_mat_id row_ids (#13326)
|
8 months ago |
Xuan-Son Nguyen
|
d9c4accaff
server : (webui) rename has_multimodal --> modalities (#13393)
|
8 months ago |
Diego Devesa
|
15e03282bb
ci : limit write permission to only the release step + fixes (#13392)
|
8 months ago |
Matt Clayton
|
f05a6d71a0
mtmd : Expose helper_decode_image_chunk (#13366)
|
8 months ago |
Xuan-Son Nguyen
|
ee01d71e58
server : (webui) fix a very small misalignment (#13387)
|
8 months ago |
Xuan-Son Nguyen
|
8c83449cb7
server : (webui) revamp the input area, plus many small UI improvements (#13365)
|
8 months ago |
Sigbjørn Skjæret
|
1a844be132
convert : support rope_scaling type and rope_type (#13349)
|
8 months ago |
welix
|
0ccc121354
mtmd : fix the calculation of n_tokens for smolvlm (#13381)
|
8 months ago |
Georgi Gerganov
|
6562e5a4d6
context : allow cache-less context for embeddings (#13108)
|
8 months ago |
Georgi Gerganov
|
51fb96b1ff
context : remove logits_all flag (#13284)
|
8 months ago |
Diego Devesa
|
70a6991edf
ci : move release workflow to a separate file (#13362)
|
8 months ago |
Diego Devesa
|
f061021206
llama : print size and type of overridden tensors (#13364)
|
8 months ago |
Alberto Cabrera Pérez
|
8733e0cf6e
sycl: addressing non-contiguous src1 mul_mats (nc and batched) (#13343)
|
8 months ago |
Diego Devesa
|
814f795e06
docker : disable arm64 and intel images (#13356)
|
8 months ago |