Jeff Bolz
|
52ab19df63
tests: Avoid floating point precision false positives in SUM (#17471)
|
3 weeks ago |
Jeff Bolz
|
5182dd64cd
test-backend-ops: improve msvc build time (#18209)
|
3 weeks ago |
Aadeshveer Singh
|
10b4f82d44
Added comments explaining thread block size selection logic based on row count and column size, derived from historical commit context (#18212)
|
3 weeks ago |
Oleksandr Kuvshynov
|
408616adbd
server : [easy] fix per round speculative decode logging (#18211)
|
3 weeks ago |
Xuan-Son Nguyen
|
9e39a1e6a9
server: support load model on startup, support preset-only options (#18206)
|
3 weeks ago |
Sigbjørn Skjæret
|
74e05131e9
ci : remove non-windows zip artifacts (#18201)
|
4 weeks ago |
Sigbjørn Skjæret
|
f74747d886
ci : only save ccache on master (#18207)
|
4 weeks ago |
Alfred
|
ce734a8a2f
ggml-hexagon: Implement true Q8_0 quantization on Hexagon NPU for more accurate mixed-precision matmul operations (#17977)
|
4 weeks ago |
Pascal
|
14931a826e
arg: fix order to use short form before long form (#18196)
|
4 weeks ago |
Julius Tischbein
|
f99ef53d2a
llama : Changing off_t to size_t for Windows (#18204)
|
4 weeks ago |
Aman Gupta
|
cc0a04343e
server: friendlier error msg when ctx < input (#18174)
|
4 weeks ago |
Xuan-Son Nguyen
|
98c1c7a7bf
presets: refactor, allow cascade presets from different sources, add global section (#18169)
|
4 weeks ago |
Aleksander Grygier
|
acb73d8340
webui: Add editing attachments in user messages (#18147)
|
4 weeks ago |
Daniel Bevenius
|
0a271d82b4
model-conversion : add verbose flag in run-org-model.py (#18194)
|
4 weeks ago |
Naco Siren
|
52fc7fee8a
android: fix missing screenshots for Android.md (#18156)
|
4 weeks ago |
Jeff Bolz
|
cdbada8d10
vulkan: Add perf logger mode with concurrency (#17944)
|
4 weeks ago |
Xuan-Son Nguyen
|
8ea958d4d9
model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106)
|
4 weeks ago |
Pascal
|
f9ec8858ed
webui: display prompt processing stats (#18146)
|
4 weeks ago |
Taimur Ahmad
|
f716588e63
ggml-cpu: extend support for RVV floating-point kernels (#17318)
|
4 weeks ago |
Xuan-Son Nguyen
|
4d1316c440
arg: fix ASAN error on sampler_type_names empty (#18167)
|
4 weeks ago |
Sigbjørn Skjæret
|
ec7b9329ae
gguf-py : use copy-on-write mode for localtensor (#18162)
|
4 weeks ago |
yulo
|
54189c0d39
remove i_major_dual (#18157)
|
4 weeks ago |
Aleksander Grygier
|
9ce64aed7d
webui: Fix selecting generated output issues during active streaming (#18091)
|
4 weeks ago |
Kim S.
|
900316da4e
webui: fix chat screen shadow width (#18010)
|
4 weeks ago |
Johannes Gäßler
|
57c1e05643
llama: offload output layer to GPU first (#18148)
|
4 weeks ago |
Sigbjørn Skjæret
|
9cff4cc554
convert : sort and use file parts from model index if present (#18043)
|
4 weeks ago |
Julius Tischbein
|
4d4f4cacd1
llama : Async DirectIO model loading on Linux (#18012)
|
4 weeks ago |
Shouyu
|
0a0bba05e8
ggml-hexagon: swiglu_oai operation (#18114)
|
1 month ago |
Sigbjørn Skjæret
|
5166aaf868
convert : force patch_merger tensors to f16/f32 (#18124)
|
1 month ago |
Pascal
|
6ce3d85796
server: (webui) add --webui-config (#18028)
|
1 month ago |