cturan/llama.cpp

Author	SHA1 Message	Date
Radoslav Gerganov	c556418b60 llama-bench : use local GPUs along with RPC servers (#14917)	5 months ago
bashayer hijji	fffcce535e llama-bench : add --no-warmup flag (#14224) (#14270)	7 months ago
Georgi Gerganov	745aa5319b llama : deprecate llama_kv_self_ API (#14030)	7 months ago
Max Krasnyansky	053b1539c0 threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995)	7 months ago
Georgi Gerganov	e298d2fbd0 kv-cache : add SWA support (#13194)	8 months ago
Diego Devesa	6c8b91500e llama-bench : fix -ot with dl backends (#13563)	8 months ago
Georgi Gerganov	b2838049cc bench : handle decode errors (#13548)	8 months ago
Diego Devesa	cf0a43bb64 llama-bench : add defrag-thold, check for invalid ranges (#13487)	8 months ago
Diego Devesa	22cdab343b llama-bench : accept ranges for integer parameters (#13410)	8 months ago
David Huang	7f323a589f Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)	8 months ago
Diego Devesa	1d36b3670b llama : move end-user examples to tools directory (#13249)	8 months ago