Romain Biessy
|
9012eb9b45
sycl: Add more debug prints (#13640)
|
8 months ago |
Jeff Bolz
|
fef693dc6b
vulkan: mark IM2COL as supporting non-contig (#13783)
|
8 months ago |
Bizhao Shi
|
2d38b6e400
CANN: Add the basic supports of Flash Attention kernel (#13627)
|
8 months ago |
Olivier Chafik
|
e121edc432
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
|
8 months ago |
Xuan-Son Nguyen
|
2f099b510f
webui : bump max upload file size to 500MB (#13779)
|
8 months ago |
Sigbjørn Skjæret
|
aa50ba462f
tests : improve UGM tokenizer test coverage (#13773)
|
8 months ago |
Georgi Gerganov
|
de2ef53a4b
kv-cache : rework kv_cell (#13706)
|
8 months ago |
Percy Piper
|
c508256db2
rpc : Fix build on OpenBSD (#13541)
|
8 months ago |
Xuan-Son Nguyen
|
40aaa8a403
mtmd : add support for Qwen2-Audio and SeaLLM-Audio (#13760)
|
8 months ago |
ddpasa
|
a08c1d2845
docs : add Moondream2 pre-quantized link (#13745)
|
8 months ago |
Olivier Chafik
|
d785f9c1fd
server: fix/test add_generation_prompt (#13770)
|
8 months ago |
Piotr Jasiukajtis
|
4032ca4066
llama : add support for Qwen3 MoE tied word embeddings (#13768)
|
8 months ago |
Akarshan Biswas
|
515fdbf7ed
SYCL: revert "sycl: simplify bin_bcast_kernel (#13383)" (#13752)
|
8 months ago |
Olivier Chafik
|
f5cd27b71d
`server`: streaming of tool calls and thoughts when `--jinja` is on (#12379)
|
8 months ago |
Diego Devesa
|
a2d02d5793
releases : bundle llvm omp library in windows release (#13763)
|
8 months ago |
Diego Devesa
|
17fc817b58
releases : enable openmp in windows cpu backend build (#13756)
|
8 months ago |
Diego Devesa
|
2bd1b30f69
ggml-cpu : set openmp wait time if not set (#13758)
|
8 months ago |
0cc4m
|
259469c4b5
Move GLM4 f32 attention fix to the correct function (#13750)
|
8 months ago |
Xuan-Son Nguyen
|
4c32832c59
ggml : add ggml_gelu_erf() CUDA kernel (#13719)
|
8 months ago |
Sigbjørn Skjæret
|
c3a2624339
vocab : fix ugm tokenizer precision (#13743)
|
8 months ago |
Johannes Gäßler
|
ffd0eae60b
CUDA: fix race condition in FA vector kernels (#13742)
|
8 months ago |
Diego Devesa
|
b775345d78
ci : enable winget package updates (#13734)
|
8 months ago |
Diego Devesa
|
a70a8a69c2
ci : add winget package updater (#13732)
|
8 months ago |
Georgi Gerganov
|
d13d0f6135
hparams : initialize arrays (#13728)
|
8 months ago |
Xuan-Son Nguyen
|
8a2afb7520
llama : allow custom list of swa_layers (#13726)
|
8 months ago |
Xuan-Son Nguyen
|
9ecf3e66a3
server : support audio input (#13714)
|
8 months ago |
Chenguang Li
|
faaaff5f94
CANN: Support MUL_MAT_ID for q8_0 and q4_0 (#13705)
|
8 months ago |
Xuan-Son Nguyen
|
e16c4731c7
ggml : fix the order of ggml_unary_op (#13718)
|
8 months ago |
Jeff Bolz
|
1dcd01960c
vulkan: support CPY from any type to itself (#13695)
|
8 months ago |
Jeff Bolz
|
c10ed6cbcc
vulkan: Disable coopmat/coopmat2/bfloat extensions if glslc doesn't support it (#13696)
|
8 months ago |