Georgi Gerganov | 745aa5319b | llama : deprecate llama_kv_self_ API (#14030) | 7 months ago
Max Krasnyansky | 053b1539c0 | threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995) | 7 months ago
Georgi Gerganov | e298d2fbd0 | kv-cache : add SWA support (#13194) | 8 months ago
Diego Devesa | 6c8b91500e | llama-bench : fix -ot with dl backends (#13563) | 8 months ago
Georgi Gerganov | b2838049cc | bench : handle decode errors (#13548) | 8 months ago
Diego Devesa | cf0a43bb64 | llama-bench : add defrag-thold, check for invalid ranges (#13487) | 8 months ago
Diego Devesa | 22cdab343b | llama-bench : accept ranges for integer parameters (#13410) | 8 months ago
David Huang | 7f323a589f | Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386) | 8 months ago
Diego Devesa | 1d36b3670b | llama : move end-user examples to tools directory (#13249) | 8 months ago