matteo
|
ced44be342
llama-chat : fix wrong template in GLM4-0414 (#13140)
|
8 months ago |
R0CKSTAR
|
e291450b76
musa: fix build warning (#13129)
|
8 months ago |
LostRuins Concedo
|
59e991c23c
Fixes Qwen2.5VL segfault during inference with https://github.com/ggml-org/llama.cpp/pull/12402 as has_qwen2vl_merger migration was incomplete (#13133)
|
8 months ago |
HimariO
|
ca2bb89eac
clip : Add Qwen2.5VL support (#12402)
|
8 months ago |
Xuan-Son Nguyen
|
2d451c8059
common : add common_remote_get_content (#13123)
|
8 months ago |
Xuan-Son Nguyen
|
4753791e70
clip : improve projector naming (#13118)
|
8 months ago |
SXX
|
77d5e9a76a
ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (#13107)
|
8 months ago |
frob
|
d5fe4e81bd
grammar : handle maxItems == 0 in JSON schema (#13117)
|
8 months ago |
Diego Devesa
|
295354ea68
llama : fix K-shift with quantized K and BLAS backend (#13113)
|
8 months ago |
City
|
558a764713
Force FP32 compute in GLM4 FFN Down (#13101)
|
9 months ago |
Xuan-Son Nguyen
|
edb18b6e8f
clip : fix pixtral on some GPU backends (#13097)
|
9 months ago |
Neo Zhang Jianyu
|
514c45608f
change the reorder tensor from init to execute OP (#13003)
|
9 months ago |
Radoslav Gerganov
|
553a5c3a9f
rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (#12943)
|
9 months ago |
Xuan-Son Nguyen
|
13be08daf9
clip : remove boi/eoi embeddings for GLM-edge model (#13081)
|
9 months ago |
Georgi Gerganov
|
226251ed56
embeddings : fix batch sizes (#13076)
|
9 months ago |
Georgi Gerganov
|
87616f0680
ggml : fix trailing whitespaces (#0)
|
9 months ago |
Georgi Gerganov
|
63b4911494
sync : ggml
|
9 months ago |
Acly
|
c6e8cc28c1
ggml : Depthwise 2D convolution (ggml/1152)
|
9 months ago |
Johannes Gäßler
|
b10d8bfdb1
CUDA: use switch statements in constexpr functions (#13095)
|
9 months ago |
Georgi Gerganov
|
13b4548877
cmake : do not include ./src as public for libllama (#13062)
|
9 months ago |
Georgi Gerganov
|
572b3141d3
clang-tidy : disable warning about missing math parenthesis (#13091)
|
9 months ago |
Xuan-Son Nguyen
|
7c727fbe39
arg : add --no-mmproj-offload (#13093)
|
9 months ago |
Xuan-Son Nguyen
|
80982e815e
arg : clean up handling --mmproj with -hf (#13082)
|
9 months ago |
Georgi Gerganov
|
7604a7d6b8
metal : fix floating-point range of attention scores in FA kernels (#13090)
|
9 months ago |
Eve
|
b3b6d862cf
vulkan: matmul gcn tuning (#13016)
|
9 months ago |
pl752
|
5630406959
llama-mtmd-cli: Sigint rework in mtmd vision example (#13080)
|
9 months ago |
Xuan-Son Nguyen
|
ecda2ec4b3
mtmd : Support Pixtral 12B (#13065)
|
9 months ago |
piDack
|
eb1776b15a
convert : Append mult-eos,half-rope,bos to GLM4-0414 and Z (#13021)
|
9 months ago |
Radoslav Gerganov
|
2cca6c01e4
rpc : add command line option for number of threads for the CPU backend (#13060)
|
9 months ago |
Johannes Gäßler
|
658987cfc9
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)
|
9 months ago |