Aleksander Grygier | acb73d8340 | webui: Add editing attachments in user messages (#18147) | 4 weeks ago
Daniel Bevenius | 0a271d82b4 | model-conversion : add verbose flag in run-org-model.py (#18194) | 4 weeks ago
Naco Siren | 52fc7fee8a | android: fix missing screenshots for Android.md (#18156) | 4 weeks ago
Jeff Bolz | cdbada8d10 | vulkan: Add perf logger mode with concurrency (#17944) | 4 weeks ago
Xuan-Son Nguyen | 8ea958d4d9 | model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106) | 4 weeks ago
Pascal | f9ec8858ed | webui: display prompt processing stats (#18146) | 4 weeks ago
Taimur Ahmad | f716588e63 | ggml-cpu: extend support for RVV floating-point kernels (#17318) | 1 month ago
Xuan-Son Nguyen | 4d1316c440 | arg: fix ASAN error on sampler_type_names empty (#18167) | 1 month ago
Sigbjørn Skjæret | ec7b9329ae | gguf-py : use copy-on-write mode for localtensor (#18162) | 1 month ago
yulo | 54189c0d39 | remove i_major_dual (#18157) | 1 month ago
Aleksander Grygier | 9ce64aed7d | webui: Fix selecting generated output issues during active streaming (#18091) | 1 month ago
Kim S. | 900316da4e | webui: fix chat screen shadow width (#18010) | 1 month ago
Johannes Gäßler | 57c1e05643 | llama: offload output layer to GPU first (#18148) | 1 month ago
Sigbjørn Skjæret | 9cff4cc554 | convert : sort and use file parts from model index if present (#18043) | 1 month ago
Julius Tischbein | 4d4f4cacd1 | llama : Async DirectIO model loading on Linux (#18012) | 1 month ago
Shouyu | 0a0bba05e8 | ggml-hexagon: swiglu_oai operation (#18114) | 1 month ago
Sigbjørn Skjæret | 5166aaf868 | convert : force patch_merger tensors to f16/f32 (#18124) | 1 month ago
Pascal | 6ce3d85796 | server: (webui) add --webui-config (#18028) | 1 month ago
Xuan-Son Nguyen | e85e9d7637 | server: (router) disable SSL on child process (#18141) | 1 month ago
Johannes Gäßler | 8dcc3662a2 | llama-fit-params: fix memory print (#18136) | 1 month ago
Kim S. | d37fc93505 | webui: fix chat header width when sidebar is closed (#17981) | 1 month ago
Shouyu | 4470a0764a | ggml-hexagon: gelu operation (#17921) | 1 month ago
Georgi Gerganov | 4301e27319 | common : restore grammar-based rejection sampling (#18137) | 1 month ago
Johannes Gäßler | a2c199e479 | common: clarify instructions for bug reports (#18134) | 1 month ago
HonestQiao | 15dd67d869 | model: fix GLM-ASR-Nano-2512 load error (#18130) (#18142) | 1 month ago
Xuan-Son Nguyen | bde461de8c | server: (router) allow child process to report status via stdout (#18110) | 1 month ago
Piotr Wilkin (ilintar) | 8faa87db02 | Extend run-org-model.py, add (a) batching (b) loading prompt from file (c) multimodal capacity (#18034) | 1 month ago
Johannes Gäßler | 6f1f6a961a | Github: ask for -v logs for params_fit [no ci] (#18128) | 1 month ago
Alberto Cabrera Pérez | 669696e00d | ggml-cpu: ARM64: repack version of q8_0 (dotprod and i8mm) (#18096) | 1 month ago
Tarek Dakhran | 982060fadc | model: fix LFM2_MOE missing tensors (#18132) | 1 month ago