Chad Voegele | c4357dcc35 | Server: Change Invalid Schema from Server Error (500) to User Error (400) (#17572) | 1 month ago
Adrien Gallouët | e148380c7c | ggml : use svcntb() for SVE vector length detection (#17474) | 1 month ago
TianHao324 | a2b0fe8d37 | CANN: Disable Ger operator of OUT_PROD on 310p device (#17563) | 1 month ago
Daniel Bevenius | 7f3a72a8ed | ggml : remove redundant n_copies check when setting input/output (#17612) | 1 month ago
Eric Curtin | b9a37717b0 | codeowners : remove ericcurtin (#17658) | 1 month ago
Adrien Gallouët | f3a9674ae8 | llama : fix signed comparison warning on FreeBSD (#17497) | 1 month ago
Xuan-Son Nguyen | 2c453c6c77 | convert: add error message for mistral3 quantized weight (#17686) | 1 month ago
Xuan-Son Nguyen | 5d6bd842ea | server: remove default "gpt-3.5-turbo" model name (#17668) | 1 month ago
senhtry | fd3abe849e | server: fixing naming conflict res_error in server-models.cpp (#17679) | 1 month ago
Xuan-Son Nguyen | 682e6658bb | server: explicitly set exec path when create new instance (#17669) | 1 month ago
Adrien Gallouët | 4574f2949e | ci : skip winget update when not in ggml-org (#17465) | 1 month ago
Adrien Gallouët | ab6726eeff | ggml : add fallback definition for HWCAP2_SVE2 (#17683) | 1 month ago
Aleksander Grygier | cee92af553 | Add context info to server error (#17663) | 1 month ago
Aman Gupta | ed32089927 | ggml-cuda: reorder only relevant nodes (#17639) | 1 month ago
Aaron Teo | 7b6d745364 | release: fix duplicate libs, store symbolic links (#17299) | 1 month ago
Neo Zhang Jianyu | 98bd9ab1e4 | enhance argsort for UT (#17573) | 1 month ago
Piotr Wilkin (ilintar) | 746f9ee889 | Override SSM_A op for Qwen3 Next to reduce splits (#17587) | 1 month ago
Jeff Bolz | 9810cb8247 | ops.md: update vulkan support (#17661) | 1 month ago
Xuan-Son Nguyen | ecf74a8417 | mtmd: add mtmd_context_params::warmup option (#17652) | 1 month ago
Gilad S. | 00c361fe53 | fix: llama arch implementation (#17665) | 1 month ago
Xuan-Son Nguyen | ec18edfcba | server: introduce API for serving / loading / unloading multiple models (#17470) | 1 month ago
Xuan-Son Nguyen | 7733409734 | common: improve verbosity level definitions (#17630) | 1 month ago
Xuan-Son Nguyen | cd3c118908 | model: support Ministral3 (#17644) | 1 month ago
Georgi Gerganov | 649495c9d9 | metal : add FA head size 48 (#17619) | 1 month ago
Georgi Gerganov | 90c72a614a | ggml : extend the GGML_SCHED_NO_REALLOC debug logic of the scheduler (#17617) | 1 month ago
Aman Gupta | 6eea666912 | llama-graph: avoid expand_forward for fusion (#17633) | 1 month ago
Xuan-Son Nguyen | ff90508d68 | contributing: update guidelines for AI-generated code (#17625) | 1 month ago
Adrien Gallouët | 0a4aeb927d | cmake : add option to build and link LibreSSL (#17552) | 1 month ago
Tarek Dakhran | 2ba719519d | model: LFM2-VL fixes (#17577) | 1 month ago
Xuan-Son Nguyen | 7f8ef50cce | clip: fix nb calculation for qwen3-vl (#17594) | 1 month ago