Jeff Bolz
|
bf69cfe62f
vulkan: fix bug in coopmat1 mul_mat_id (#12316)
|
10 months ago |
uvos
|
10f2e81809
CUDA/HIP: refractor mmqv to unify the calculation of nwarps and rows per block between host and device code. (#12177)
|
10 months ago |
jklincn
|
ba7654380a
ggml-backend : fix backend search path (#12330)
|
10 months ago |
BB-fat
|
6ab2e4765a
metal : Cache the Metal library at the device context level (#12265)
|
10 months ago |
Xuan-Son Nguyen
|
96e1280839
clip : bring back GPU support (#12322)
|
10 months ago |
Eve
|
2c9f833d17
mat vec double buffer (#12188)
|
10 months ago |
R0CKSTAR
|
251364549f
musa: support new arch mp_31 and update doc (#12296)
|
10 months ago |
Henry Linjamäki
|
8acdacb3ea
opencl: use OpenCL C standard supported by the device (#12221)
|
10 months ago |
John Bean
|
89b2b56e86
readme: added Sidekick to available UIs (#12311)
|
10 months ago |
Georgi Gerganov
|
e128a1bf5b
tests : fix test-quantize-fns to init the CPU backend (#12306)
|
10 months ago |
marcoStocchi
|
6ef79a67ca
common : refactor '-o' option (#12278)
|
10 months ago |
Olivier Chafik
|
4e39a3c332
`server`: extract <think> tags from qwq outputs (#12297)
|
10 months ago |
Olivier Chafik
|
be421fc429
`tool-call`: ensure there's always a non-empty tool call id (#12292)
|
10 months ago |
Olivier Chafik
|
87c2630546
allow missing content in message if tool_calls provided (#12293)
|
10 months ago |
Olivier Chafik
|
2b3a25c212
`sampler`: fixes trigger tokens + lazy grammars (fix typo cast from token to string) (#12291)
|
10 months ago |
tc-mb
|
8352cdc87b
llava : fix bug in minicpm-v code (#11513)
|
10 months ago |
Georgi Gerganov
|
1e2f78a004
server : add speculative decoding presets for FIM (#12287)
|
10 months ago |
Georgi Gerganov
|
0fd7ca7a21
authors : update (#12271)
|
10 months ago |
Jason C.H
|
6fefc05a7a
ggml-backend : make path_str compatible with C++20 (#12269)
|
10 months ago |
Georgi Gerganov
|
7ab364390f
server : infill gen ends on new line (#12254)
|
10 months ago |
Daniel Bevenius
|
7c7f3b7f43
ggml : skip intermediate .air file when compiling .metallib (#12247)
|
10 months ago |
Georgi Gerganov
|
102ac1891d
sync : ggml
|
10 months ago |
vmobilis
|
d6ae2fa061
ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118)
|
10 months ago |
Rémy O
|
68d0027f3d
ggml-cpu: faster AVX2 variant for IQ1_M (#12216)
|
10 months ago |
Georgi Gerganov
|
ea002810a2
ci : fix save-load test invocations (#12245)
|
10 months ago |
Sigbjørn Skjæret
|
8fad3c7a7c
server : Log original chat template parsing error (#12233)
|
10 months ago |
Olivier Chafik
|
7cf64f6bee
sync: minja - support QwQ-32B (#12235)
|
10 months ago |
BB-fat
|
5e2d57b2b2
metal : simplify kernel arguments using a struct (#3229) (#12194)
|
10 months ago |
David Huang
|
f1648e91cf
HIP: fix rocWMMA build flags under Windows (#12230)
|
10 months ago |
Daniel Bevenius
|
d6c95b0740
metal : fix default.metallib build (#12224)
|
10 months ago |