Xuan-Son Nguyen | 3f96aeff39 | llama : one-off chat template fix for Mistral-Small-2503 (#13398) | 8 months ago
Radoslav Gerganov | b486ba05bf | rpc : add rpc_msg_set_tensor_hash_req (#13353) | 8 months ago
Jeff Bolz | 02115dcd9a | vulkan: Allow up to 4096 elements for mul_mat_id row_ids (#13326) | 8 months ago
Xuan-Son Nguyen | d9c4accaff | server : (webui) rename has_multimodal --> modalities (#13393) | 8 months ago
Diego Devesa | 15e03282bb | ci : limit write permission to only the release step + fixes (#13392) | 8 months ago
Matt Clayton | f05a6d71a0 | mtmd : Expose helper_decode_image_chunk (#13366) | 8 months ago
Xuan-Son Nguyen | ee01d71e58 | server : (webui) fix a very small misalignment (#13387) | 8 months ago
Xuan-Son Nguyen | 8c83449cb7 | server : (webui) revamp the input area, plus many small UI improvements (#13365) | 8 months ago
Sigbjørn Skjæret | 1a844be132 | convert : support rope_scaling type and rope_type (#13349) | 8 months ago
welix | 0ccc121354 | mtmd : fix the calculation of n_tokens for smolvlm (#13381) | 8 months ago
Georgi Gerganov | 6562e5a4d6 | context : allow cache-less context for embeddings (#13108) | 8 months ago
Georgi Gerganov | 51fb96b1ff | context : remove logits_all flag (#13284) | 8 months ago
Diego Devesa | 70a6991edf | ci : move release workflow to a separate file (#13362) | 8 months ago
Diego Devesa | f061021206 | llama : print size and type of overridden tensors (#13364) | 8 months ago
Alberto Cabrera Pérez | 8733e0cf6e | sycl: addressing non-contiguous src1 mul_mats (nc and batched) (#13343) | 8 months ago
Diego Devesa | 814f795e06 | docker : disable arm64 and intel images (#13356) | 8 months ago
Georgi Gerganov | d879433824 | sync : ggml | 8 months ago
Daniel Bevenius | 13b0a04597 | whisper: remove MSVC warnings pragmas (whisper/3090) | 8 months ago
Jared Tweed | bba9d945c1 | cmake : removed stdc++fs (whisper/3097) | 8 months ago
Sigbjørn Skjæret | bc4e1128f7 | llama : deci : support ffn-free with attention (#13296) | 8 months ago
Ycros | 39e73ae0d6 | common : Add a warning when we can't match samplers from a string or char. (#13330) | 8 months ago
R0CKSTAR | 1f73301b63 | cuda : remove nrows_x in mul_mat_q_process_tile (#13325) | 8 months ago
Georgi Gerganov | 4773d7a02f | examples : remove infill (#13283) | 8 months ago
piDack | 6c7fd67b64 | llama : support tie embedding for chatglm models (#13328) | 8 months ago
Johannes Gäßler | 141a908a59 | CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (#13135) | 8 months ago
Xuan-Son Nguyen | 32916a4907 | clip : refactor graph builder (#13321) | 8 months ago
DocShotgun | ffc727203a | sampling : make top_n_sigma no-op at <=0 or a single candidate (#13345) | 8 months ago
oobabooga | 91a86a6f35 | sampling : don't consider -infinity values in top_n_sigma (#13344) | 8 months ago
Diego Devesa | f4ed10b69c | cmake : remove arm64 msvc presets (#13342) | 8 months ago
Akarshan Biswas | 1e333d5bba | SYCL: Disable reorder optimize by default and stop setting tensor extras when optimize is disabled (#13254) | 8 months ago