Naco Siren | 5c0d18881e | llama.android : Rewrite Android binding (w/o cpu_features dep) (#17413) | 1 month ago
TrevorS | 4b2a4778f8 | arg: allow -kvu flag for llama-perplexity (#18117) | 1 month ago
Aadeshveer Singh | 58062860af | ggml : use WARP_SIZE/2 for argmax reduction offset (#18092) | 1 month ago
Yuri Khrustalev | 2973a65ecb | gguf-py : allow converting multi-tensor models from read-only locations (#18100) | 1 month ago
Johannes Gäßler | d0794e89d9 | llama-fit-params: force disable mlock (#18103) | 1 month ago
Johannes Gäßler | 9dcac6cf9f | llama-fit-params: lower ctx size for multi GPU (#18101) | 1 month ago
Johannes Gäßler | 0e49a7b8b4 | llama-fit-params: fix underflow for dense models (#18095) | 1 month ago
Johannes Gäßler | 4164596c76 | llama-fit-params: QoL impr. for prints/errors (#18089) | 1 month ago
Xuan-Son Nguyen | ef83fb8601 | model: fix LFM2 missing tensors (#18105) | 1 month ago
Johannes Gäßler | ec98e20021 | llama: fix early stop in params_fit if ctx is set (#18070) | 1 month ago
yifant-code | 59977eba7b | server: fix crash when batch > ubatch with embeddings (#17912) | 1 month ago
Daniel Bevenius | 79dbae034a | model-conversion : remove -fa option in model card template [no ci] (#18088) | 1 month ago
Xuan-Son Nguyen | 7f2b2f3c77 | arch: refactor LLM_TENSOR_NAMES (#18051) | 1 month ago
Xuan-Son Nguyen | 7b1db3d3b7 | arg: clarify auto kvu/np being set on server (#17997) | 1 month ago
Piotr Wilkin (ilintar) | a5251ca11d | Optimization: Qwen3 next autoregressive pass (#17996) | 1 month ago
Andrew Aladjev | fb644247de | CLI: fixed adding cli and completion into docker containers, improved docs (#18003) | 1 month ago
2114L3 | 5f5f9b4637 | server: Update README.md incorrect argument (#18073) | 1 month ago
Xuan-Son Nguyen | 3d86c6c2b5 | model: support GLM4V vision encoder (#18042) | 1 month ago
Daniel Bevenius | 9963b81f63 | model-conversion : add note about verifying previous models (#18082) | 1 month ago
Daniel Bevenius | db81d5ec4b | model-conversion : use CONVERTED_EMBEDDING_MODEL for embedding_verify_logits (#18079) | 1 month ago
Aldehir Rojas | c05aa69f32 | common : add nemotron 3 parsing (#18077) | 1 month ago
Francisco Herrera | 279cef27c2 | added note for old Intel hardware pre sycl (#18017) | 1 month ago
Georgi Gerganov | 5ba95754ee | security : add collaborator guidance (#18081) | 1 month ago
Chris Peterson | 2aa45ef9e3 | llama: Include algorithm header needed for C++23 (#18078) | 1 month ago
Georgi Gerganov | c560316440 | graph : reuse SSM graphs (#16490) | 1 month ago
Sigbjørn Skjæret | d6742125c3 | ci : separate webui from server (#18072) | 1 month ago
Aleksander Grygier | 3034836d36 | webui: Improve copy to clipboard with text attachments (#17969) | 1 month ago
Aleksander Grygier | a20979d433 | webui: Add setting to always show sidebar on Desktop (#17809) | 1 month ago
Daniel Bevenius | 2995341730 | llama : add support for NVIDIA Nemotron 3 Nano (#18058) | 1 month ago
Darius Lukas | 40d9c394f4 | Webui: Disable attachment button and model selector button when prompt textbox is disabled. (#17925) | 1 month ago