Akarshan Biswas
|
1e333d5bba
SYCL: Disable reorder optimize by default and stop setting tensor extras when optimize is disabled (#13254)
|
8 months ago |
Xuan-Son Nguyen
|
2f54e348ad
llama : fix build_ffn without gate (#13336)
|
8 months ago |
Johannes Gäßler
|
2356fb1d53
CUDA: fix bad asserts for partial offload (#13337)
|
8 months ago |
Sigbjørn Skjæret
|
764b85627b
convert : qwen2/3moe : set yarn metadata if present (#13331)
|
8 months ago |
Johannes Gäßler
|
15a28ec8c7
CUDA: fix --split-mode row for MMQ (#13323)
|
8 months ago |
compilade
|
a7366faa5b
gguf-py : avoid requiring pyside6 for other scripts (#13036)
|
8 months ago |
Johannes Gäßler
|
9070365020
CUDA: fix logic for clearing padding with -ngl 0 (#13320)
|
8 months ago |
oobabooga
|
233461f812
sampling : Integrate Top-nσ into main sampling chain (and add it to the server) (#13264)
|
8 months ago |
igardev
|
b34c859146
server : Webui - change setText command from parent window to also send the message. (#13309)
|
8 months ago |
Xuan-Son Nguyen
|
9b61acf060
mtmd : rename llava directory to mtmd (#13311)
|
8 months ago |
Xuan-Son Nguyen
|
5215b91e93
clip : fix confused naming ffn_up and ffn_down (#13290)
|
8 months ago |
Sigbjørn Skjæret
|
ae803bfc3d
convert : bailingmoe : set yarn metadata if present (#13312)
|
8 months ago |
Akarshan Biswas
|
66645a5285
SYCL: Disable mul_mat kernels for noncontiguous tensor b (#13308)
|
8 months ago |
Xuan-Son Nguyen
|
27aa259532
mtmd : add C public API (#13184)
|
8 months ago |
Diego Devesa
|
9fdfcdaedd
rpc : use backend registry, support dl backends (#13304)
|
8 months ago |
Aaron Teo
|
6eb7d25c70
ggml : activate s390x simd for Q3_K (#13301)
|
8 months ago |
Diego Devesa
|
86bd60d3fe
llava/mtmd : fixes to fully support dl backends (#13303)
|
8 months ago |
Diego Devesa
|
9f2da5871f
llama : build windows releases with dl backends (#13220)
|
8 months ago |
Johannes Gäßler
|
93c4e23905
CUDA: fix race condition in MMQ stream-k fixup (#13299)
|
8 months ago |
Johannes Gäßler
|
8afbd96818
CUDA: fix race condition in MMQ ids_dst (#13294)
|
8 months ago |
Jeff Bolz
|
8ae5ebcf85
vulkan: Additional type support for unary, binary, and copy (#13266)
|
8 months ago |
Johannes Gäßler
|
3e959f0976
imatrix: fix oob writes if src1 is not contiguous (#13286)
|
8 months ago |
Xuan-Son Nguyen
|
36667c8edc
clip : revert the change of BOI/EOI token for GLM-edge (⚠️ breaking change) (#13259)
|
8 months ago |
ymcki
|
3bf785f3ef
llama : Llama-3_1-Nemotron-Ultra-253B-v1 support (#12843)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |
Georgi Gerganov
|
b34443923c
sync : ggml (#13268)
|
8 months ago |
Georgi Gerganov
|
a75cb30dc9
context : fix reorder logic (#13267)
|
8 months ago |
shalinib-ibm
|
3f3769ba76
ggml : Enable MMA for BF16 in llamafile_sgemm (#13148)
|
8 months ago |
Jared Van Bortel
|
2f567611c0
llama-model : support Qwen2 embedding models and pooling_mode_lasttoken (#13245)
|
8 months ago |
Jared Van Bortel
|
7d2123484e
convert : use correct context length for nomic-embed-text-v2 (#13216)
|
8 months ago |