Daniel Bevenius | 2758fa10da | examples : add model conversion tool/example (#15455) | 5 months ago
Michael Giba | b108e42904 | ci : fix -Werror=return-type in clip.cpp so ci/run.sh can run without issue (#15221) | 5 months ago
Copilot | 245be739df | ci : add copilot-instructions.md (#15286) | 5 months ago
Julien Denize | b2caf67db1 | convert : make Mistral community chat templates optional via parameter (#15420) | 5 months ago
Jie Fu (傅杰) | 2f3dbffb17 | common : fix incorrect print of non-ascii characters in the logging (#15466) | 5 months ago
Xuan-Son Nguyen | 945e1f12a6 | ggml : fix condition of im2col on Metal backend (#15460) | 5 months ago
stduhpf | 1b0db8f6e0 | server : fix webui (#15462) | 5 months ago
Daniel Bevenius | 29f538ac63 | examples : remove references to `make` in examples [no ci] (#15457) | 5 months ago
R0CKSTAR | 8ad038c0fd | musa: add GGML_UNUSED_VARS (#15446) | 5 months ago
Diego Devesa | 5682a3745f | sched : copy only the used experts when offloading prompt processing (#15346) | 5 months ago
teo | 1bc664a26a | server: fix OpenAI API compatibility for usage statistics in chat streams (#15444) | 5 months ago
Johannes Gäßler | 13aeb7aef2 | CUDA: refactor FA support/selection code (#15454) | 5 months ago
Johannes Gäßler | 7a6e91ad26 | CUDA: replace GGML_CUDA_F16 with CUDA arch checks (#15433) | 5 months ago
Jeff Bolz | fec9519802 | vulkan: shorten pipeline name strings (#15431) | 5 months ago
Daniel Bevenius | 657b8a77bd | chat: handle gpt-oss return/end token inconsistency (#15421) | 5 months ago
Jie Fu (傅杰) | ec5ab1a36c | common : fix context shift help message (#15448) | 5 months ago
xiaobing318 | 1a99c2d948 | cmake : fix target include directories (#15450) | 5 months ago
Daniel Bevenius | 37f10f955f | make : remove make in favor of CMake (#15449) | 5 months ago
Georgi Gerganov | 2f37014073 | lookahead : add sample command to readme (#15447) | 5 months ago
R0CKSTAR | a094f38143 | musa: fix build warnings (#15258) | 5 months ago
lhez | fb22dd07a6 | opencl: mark `argsort` unsupported if cols exceed workgroup limit (#15375) | 5 months ago
Georgi Gerganov | 9ef6b0b835 | model : add gpt-oss type strings (#15424) | 5 months ago
Gian-Carlo Pascutto | 1e19f5d462 | common : Add top-nsigma sampler to help globally (#15428) | 5 months ago
Georgi Gerganov | d2fcd91cf9 | server : disable context shift by default (#15416) | 5 months ago
SHUAI YANG | a6d3cfe7fa | CANN: optimize rope operator (#15335) | 5 months ago
R0CKSTAR | 67f09a3a27 | musa: handle __hgt2_mask, available starting from MUSA SDK rc4.3.0 (#15413) | 5 months ago
Marvin Gießing | 6424594c56 | ggml-cpu: add mxfp4 VSX intrinsics for Power9+ (ppc64le) hardware (#15385) | 5 months ago
Xuan-Son Nguyen | e9288e8869 | chat : clarify the meaning of reasoning_format (#15408) | 5 months ago
Georgi Gerganov | 9d262f4bad | server : remove swa_full warning (#15399) | 5 months ago
Georgi Gerganov | f0d3c7405c | batched-bench : use rand tokens (#15398) | 5 months ago