Olivier Chafik
|
c9bbc77931
`server`: update deepseek reasoning format (pass reasoning_content as diffs) (#13933)
|
7 months ago |
Max Krasnyansky
|
053b1539c0
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995)
|
7 months ago |
Georgi Gerganov
|
53f925074d
sync : vendor (#13901)
|
7 months ago |
Olivier Chafik
|
cdf94a1802
server: --offline mode (#13804)
|
7 months ago |
Olivier Chafik
|
e121edc432
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)
|
7 months ago |
Xuan-Son Nguyen
|
797990c4bc
mtmd : add ultravox audio input (#13623)
|
7 months ago |
Sigbjørn Skjæret
|
2aa777d86d
examples : switch retrieval to llama_encode (#13685)
|
8 months ago |
Georgi Gerganov
|
a4090d1174
llama : remove llama_kv_cache_view API + remove deprecated (#13653)
|
8 months ago |
Georgi Gerganov
|
e298d2fbd0
kv-cache : add SWA support (#13194)
|
8 months ago |
Isaac McFadyen
|
6a2bc8bfb7
server : added --no-prefill-assistant flag (#13608)
|
8 months ago |
Georgi Gerganov
|
518329b2d4
parallel : add option for non-shared and larger prompts (#13598)
|
8 months ago |
David Huang
|
7f323a589f
Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)
|
8 months ago |
Xuan-Son Nguyen
|
7fef11766c
arg : add env var to control mmproj (#13416)
|
8 months ago |
Xuan-Son Nguyen
|
33eff40240
server : vision support via libmtmd (#12898)
|
8 months ago |
Bartowski
|
efb8b47eda
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
|
8 months ago |
Georgi Gerganov
|
51fb96b1ff
context : remove logits_all flag (#13284)
|
8 months ago |
Georgi Gerganov
|
4773d7a02f
examples : remove infill (#13283)
|
8 months ago |
Xuan-Son Nguyen
|
9b61acf060
mtmd : rename llava directory to mtmd (#13311)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |
Georgi Gerganov
|
fab647e884
server : add cache reuse card link to help (#13230)
|
8 months ago |
Xuan-Son Nguyen
|
13c9a3319b
arg : remove CURLINFO_EFFECTIVE_METHOD (#13228)
|
8 months ago |
Xuan-Son Nguyen
|
6f67cf1f48
arg : -hf do not fail if url mismatch (#13219)
|
8 months ago |
Olivier Chafik
|
3b127c7385
common : add -jf / --json-schema-file flag (#12011)
|
8 months ago |
Xuan-Son Nguyen
|
5933e6fdc9
arg : allow using -hf offline (#13202)
|
8 months ago |
Georgi Gerganov
|
43f2b07193
common : fix noreturn compile warning (#13151)
|
8 months ago |
Xuan-Son Nguyen
|
85f36e5e71
arg : fix unused variable (#13142)
|
8 months ago |
Xuan-Son Nguyen
|
2d451c8059
common : add common_remote_get_content (#13123)
|
8 months ago |
Georgi Gerganov
|
13b4548877
cmake : do not include ./src as public for libllama (#13062)
|
8 months ago |
Xuan-Son Nguyen
|
7c727fbe39
arg : add --no-mmproj-offload (#13093)
|
8 months ago |
Xuan-Son Nguyen
|
80982e815e
arg : clean up handling --mmproj with -hf (#13082)
|
8 months ago |