Shouyu
|
0a0bba05e8
ggml-hexagon: swiglu_oai operation (#18114)
|
1 month ago |
Sigbjørn Skjæret
|
5166aaf868
convert : force patch_merger tensors to f16/f32 (#18124)
|
1 month ago |
Pascal
|
6ce3d85796
server: (webui) add --webui-config (#18028)
|
1 month ago |
Xuan-Son Nguyen
|
e85e9d7637
server: (router) disable SSL on child process (#18141)
|
1 month ago |
Johannes Gäßler
|
8dcc3662a2
llama-fit-params: fix memory print (#18136)
|
1 month ago |
Kim S.
|
d37fc93505
webui: fix chat header width when sidebar is closed (#17981)
|
1 month ago |
Shouyu
|
4470a0764a
ggml-hexagon: gelu operation (#17921)
|
1 month ago |
Georgi Gerganov
|
4301e27319
common : restore grammar-based rejection sampling (#18137)
|
1 month ago |
Johannes Gäßler
|
a2c199e479
common: clarify instructions for bug reports (#18134)
|
1 month ago |
HonestQiao
|
15dd67d869
model: fix GLM-ASR-Nano-2512 load error (#18130) (#18142)
|
1 month ago |
Xuan-Son Nguyen
|
bde461de8c
server: (router) allow child process to report status via stdout (#18110)
|
1 month ago |
Piotr Wilkin (ilintar)
|
8faa87db02
Extend run-org-model.py, add (a) batching (b) loading prompt from file (c) multimodal capacity (#18034)
|
1 month ago |
Johannes Gäßler
|
6f1f6a961a
Github: ask for -v logs for params_fit [no ci] (#18128)
|
1 month ago |
Alberto Cabrera Pérez
|
669696e00d
ggml-cpu: ARM64: repack version of q8_0 (dotprod and i8mm) (#18096)
|
1 month ago |
Tarek Dakhran
|
982060fadc
model: fix LFM2_MOE missing tensors (#18132)
|
1 month ago |
Sigbjørn Skjæret
|
6853bee680
ci : clean up webui jobs (#18116)
|
1 month ago |
Pascal
|
487674fbb3
common: fix --override-kv to support comma-separated values (#18056)
|
1 month ago |
yulo
|
acec774ef6
HIP: Refactor mma for RDNA and CDNA (#17990)
|
1 month ago |
Naco Siren
|
5c0d18881e
llama.android : Rewrite Android binding (w/o cpu_features dep) (#17413)
|
1 month ago |
TrevorS
|
4b2a4778f8
arg: allow -kvu flag for llama-perplexity (#18117)
|
1 month ago |
Aadeshveer Singh
|
58062860af
ggml : use WARP_SIZE/2 for argmax reduction offset (#18092)
|
1 month ago |
Yuri Khrustalev
|
2973a65ecb
gguf-py : allow converting multi-tensor models from read-only locations (#18100)
|
1 month ago |
Johannes Gäßler
|
d0794e89d9
llama-fit-params: force disable mlock (#18103)
|
1 month ago |
Johannes Gäßler
|
9dcac6cf9f
llama-fit-params: lower ctx size for multi GPU (#18101)
|
1 month ago |
Johannes Gäßler
|
0e49a7b8b4
llama-fit-params: fix underflow for dense models (#18095)
|
1 month ago |
Johannes Gäßler
|
4164596c76
llama-fit-params: QoL impr. for prints/errors (#18089)
|
1 month ago |
Xuan-Son Nguyen
|
ef83fb8601
model: fix LFM2 missing tensors (#18105)
|
1 month ago |
Johannes Gäßler
|
ec98e20021
llama: fix early stop in params_fit if ctx is set (#18070)
|
1 month ago |
yifant-code
|
59977eba7b
server: fix crash when batch > ubatch with embeddings (#17912)
|
1 month ago |
Daniel Bevenius
|
79dbae034a
model-conversion : remove -fa option in model card template [no ci] (#18088)
|
1 month ago |