Sigbjørn Skjæret
|
e98b3692be
llama : set qwen3 model type sizes (#13175)
|
8 months ago |
Xuan-Son Nguyen
|
b6ce7430b7
llama-graph : fix text position for mrope (#13159)
|
8 months ago |
AT
|
5f5e39e1ba
model : Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture (#12466)
|
8 months ago |
Xuan-Son Nguyen
|
eaea325324
clip : fix model size display (#13153)
|
8 months ago |
Ville Vesilehto
|
43ddab6eee
fix(rpc): Improve input validation and error handling (#13069)
|
8 months ago |
Vishal Agarwal
|
1831f538f7
llama-bench: add `-d` depth arg (#13096)
|
8 months ago |
Xuan-Son Nguyen
|
4e87962e34
mtmd : fix glm-edge redundant token count (#13139)
|
8 months ago |
pockers21
|
fb0471d175
context : do not clear output buffer on reserve (#13152)
|
8 months ago |
Xuan-Son Nguyen
|
d2b2031e5f
llama : (mrope) allow using normal 1D position for text token (#13138)
|
8 months ago |
Xuan-Son Nguyen
|
5fa9e63be8
clip : refactor set input for cgraph + fix qwen2.5vl input (#13136)
|
8 months ago |
Akarshan Biswas
|
a4c340f974
SYCL: Add all missing unary kernels (#13074)
|
8 months ago |
Georgi Gerganov
|
d0a417f3c7
readme : update hot topics (#13150)
|
8 months ago |
Georgi Gerganov
|
43f2b07193
common : fix noreturn compile warning (#13151)
|
8 months ago |
Xuan-Son Nguyen
|
e5d6c2554e
llama-chat : fix typo GML --> GLM (#13143)
|
8 months ago |
R0CKSTAR
|
f0dd6a1926
musa: fix typo in cc control (#13144)
|
8 months ago |
Johannes Gäßler
|
69699be48a
CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (#13137)
|
8 months ago |
Xuan-Son Nguyen
|
85f36e5e71
arg : fix unused variable (#13142)
|
8 months ago |
4onen
|
c0a97b762e
llama-bench : Add `--override-tensors` arg (#12922)
|
8 months ago |
matteo
|
ced44be342
llama-chat : fix wrong template in GLM4-0414 (#13140)
|
8 months ago |
R0CKSTAR
|
e291450b76
musa: fix build warning (#13129)
|
8 months ago |
LostRuins Concedo
|
59e991c23c
Fixes Qwen2.5VL segfault during inference with https://github.com/ggml-org/llama.cpp/pull/12402 as has_qwen2vl_merger migration was incomplete (#13133)
|
8 months ago |
HimariO
|
ca2bb89eac
clip : Add Qwen2.5VL support (#12402)
|
8 months ago |
Xuan-Son Nguyen
|
2d451c8059
common : add common_remote_get_content (#13123)
|
8 months ago |
Xuan-Son Nguyen
|
4753791e70
clip : improve projector naming (#13118)
|
8 months ago |
SXX
|
77d5e9a76a
ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (#13107)
|
8 months ago |
frob
|
d5fe4e81bd
grammar : handle maxItems == 0 in JSON schema (#13117)
|
8 months ago |
Diego Devesa
|
295354ea68
llama : fix K-shift with quantized K and BLAS backend (#13113)
|
8 months ago |
City
|
558a764713
Force FP32 compute in GLM4 FFN Down (#13101)
|
8 months ago |
Xuan-Son Nguyen
|
edb18b6e8f
clip : fix pixtral on some GPU backends (#13097)
|
8 months ago |
Neo Zhang Jianyu
|
514c45608f
change the reorder tensor from init to execute OP (#13003)
|
8 months ago |