Georgi Gerganov
|
cd5e3b5754
server : support unified cache across slots (#16736)
|
2 months ago |
Aldehir Rojas
|
87c9efc3b2
common : move gpt-oss reasoning processing to init params (#16937)
|
2 months ago |
Adrian Lundberg
|
76af40aaaa
docs: remove llama_sampler_accept reference in sampling sample usage (#16920)
|
2 months ago |
mnehete32
|
7db35a7958
CUDA: add FLOOR, CEIL, ROUND, TRUNC unary ops (#16917)
|
2 months ago |
Aaron Teo
|
a864132ba5
devops: fix failing s390x docker build (#16918)
|
2 months ago |
Aaron Teo
|
d38d9f0877
ggml: add s390x cpu-feats (#16774)
|
2 months ago |
Georgi Gerganov
|
7fd205a8e8
scripts : add script to bench models (#16894)
|
2 months ago |
Pascal
|
2f68ce7cfd
webui: auto-refresh /props on inference start to resync model metadata (#16784)
|
2 months ago |
Pascal
|
e4a71599e5
webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe (#16757)
|
2 months ago |
Adrien Gallouët
|
dd5e8cab51
vendor : update cpp-httplib to 0.27.0 (#16846)
|
2 months ago |
Xuan-Son Nguyen
|
cf659bbb8e
mtmd: refactor preprocessing + support max/min pixels (#16878)
|
2 months ago |
Aleksander Grygier
|
d8b860a219
Add a setting to display message generation statistics (#16901)
|
2 months ago |
Jaromír Hradílek
|
1ae74882f8
webui: recognize AsciiDoc files as valid text files (#16850)
|
2 months ago |
Sigbjørn Skjæret
|
961660b8c3
common : allow --system-prompt-file for diffusion-cli (#16903)
|
2 months ago |
Sigbjørn Skjæret
|
74fef4129f
codeowners : update after refactor (#16905)
|
2 months ago |
Jeff Bolz
|
5d8bb900bc
vulkan: Fix multi_add invalid descriptor usage (#16899)
|
2 months ago |
Jeff Bolz
|
2e76e01360
vulkan: fuse mul_mat+add and mul_mat_id+add_id (#16868)
|
2 months ago |
Oliver Simons
|
d3dc9dd898
CUDA: Remove unneded bias/gate dims in fused mmvq (#16858)
|
2 months ago |
Piotr Wilkin (ilintar)
|
bea04522ff
refactor : llama-model.cpp (#16252)
|
2 months ago |
Piotr Wilkin (ilintar)
|
0de0a01576
model : Minimax M2 (#16831)
|
2 months ago |
Giuseppe Scrivano
|
e58d585604
model : add Granite Hybrid nano types (#16896)
|
2 months ago |
Johannes Gäßler
|
31c511a968
CUDA: Volta tensor core support for MMF (#16843)
|
2 months ago |
Georgi Gerganov
|
6d39015a74
sync : ggml
|
2 months ago |
Aman Gupta
|
4146d6a1a6
CUDA: add expert reduce kernel (#16857)
|
2 months ago |
Georgi Gerganov
|
8da3c0e200
batch : fix consistency checks for the input positions (#16890)
|
2 months ago |
Georgi Gerganov
|
c22473b580
server : don't print user inputs to console (#16871)
|
2 months ago |
Daniel Bevenius
|
0f715b4e75
server : fix typos in server.cpp comments [no ci] (#16883)
|
2 months ago |
Jeff Bolz
|
d2d931f173
vulkan: disable spirv-opt for rope shaders (#16872)
|
2 months ago |
Masato Nakasaka
|
2976b0374d
vulkan: Fix crash when FP16 mul_mat accumulation is not supported (#16796)
|
2 months ago |
Ruben Ortlam
|
d2a2673dd1
vulkan: fix shmem overrun in mmq id shader (#16873)
|
2 months ago |