theo77186
|
622cd010ff
ggml: CUDA: add head size 72 for flash-attn (#16962)
|
2 months ago |
Xuan-Son Nguyen
|
070ff4d535
mtmd: add --image-min/max-tokens (#16921)
|
2 months ago |
Xuan-Son Nguyen
|
bf7b0c9725
mtmd: pad mask for qwen2.5vl (#16954)
|
2 months ago |
Jinyang He
|
fcfce040e8
ggml : LoongArch fixes (#16958)
|
2 months ago |
Olivier Chafik
|
ee3a5a10ad
sync: minja (glm 4.6 & minmax m2 templates) (#16949)
|
2 months ago |
shani-f
|
7e994168b1
SYCL: optimized repeat_back kernel (3× fewer asm instructions, 2× faster)Feature/sycl repeat back opt (#16869)
|
2 months ago |
Sascha Rogmann
|
bcfa87622a
feat(webui): improve LaTeX rendering with currency detection (#16508)
|
2 months ago |
Shagun Bera
|
a2054e3a8f
test-backend-ops : fix segfault in moe-expert-reduce test in support mode and coverage (#16936)
|
2 months ago |
Sigbjørn Skjæret
|
dd52868050
ci : disable failing riscv cross build (#16952)
|
2 months ago |
Zhiyong Wang
|
6b9a52422b
model: add Janus Pro for image understanding (#16906)
|
2 months ago |
Georgi Gerganov
|
2f966b8ed8
clip : use FA (#16837)
|
2 months ago |
Georgi Gerganov
|
cd5e3b5754
server : support unified cache across slots (#16736)
|
2 months ago |
Aldehir Rojas
|
87c9efc3b2
common : move gpt-oss reasoning processing to init params (#16937)
|
2 months ago |
Adrian Lundberg
|
76af40aaaa
docs: remove llama_sampler_accept reference in sampling sample usage (#16920)
|
2 months ago |
mnehete32
|
7db35a7958
CUDA: add FLOOR, CEIL, ROUND, TRUNC unary ops (#16917)
|
2 months ago |
Aaron Teo
|
a864132ba5
devops: fix failing s390x docker build (#16918)
|
2 months ago |
Aaron Teo
|
d38d9f0877
ggml: add s390x cpu-feats (#16774)
|
2 months ago |
Georgi Gerganov
|
7fd205a8e8
scripts : add script to bench models (#16894)
|
2 months ago |
Pascal
|
2f68ce7cfd
webui: auto-refresh /props on inference start to resync model metadata (#16784)
|
2 months ago |
Pascal
|
e4a71599e5
webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe (#16757)
|
2 months ago |
Adrien Gallouët
|
dd5e8cab51
vendor : update cpp-httplib to 0.27.0 (#16846)
|
2 months ago |
Xuan-Son Nguyen
|
cf659bbb8e
mtmd: refactor preprocessing + support max/min pixels (#16878)
|
2 months ago |
Aleksander Grygier
|
d8b860a219
Add a setting to display message generation statistics (#16901)
|
2 months ago |
Jaromír Hradílek
|
1ae74882f8
webui: recognize AsciiDoc files as valid text files (#16850)
|
2 months ago |
Sigbjørn Skjæret
|
961660b8c3
common : allow --system-prompt-file for diffusion-cli (#16903)
|
2 months ago |
Sigbjørn Skjæret
|
74fef4129f
codeowners : update after refactor (#16905)
|
2 months ago |
Jeff Bolz
|
5d8bb900bc
vulkan: Fix multi_add invalid descriptor usage (#16899)
|
2 months ago |
Jeff Bolz
|
2e76e01360
vulkan: fuse mul_mat+add and mul_mat_id+add_id (#16868)
|
2 months ago |
Oliver Simons
|
d3dc9dd898
CUDA: Remove unneded bias/gate dims in fused mmvq (#16858)
|
2 months ago |
Piotr Wilkin (ilintar)
|
bea04522ff
refactor : llama-model.cpp (#16252)
|
2 months ago |