Commit History

Author SHA1 Message Date
  Radoslav Gerganov c556418b60 llama-bench : use local GPUs along with RPC servers (#14917) 5 months ago
  bashayer hijji fffcce535e llama-bench : add --no-warmup flag (#14224) (#14270) 7 months ago
  Georgi Gerganov 745aa5319b llama : deprecate llama_kv_self_ API (#14030) 7 months ago
  Max Krasnyansky 053b1539c0 threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995) 7 months ago
  Georgi Gerganov e298d2fbd0 kv-cache : add SWA support (#13194) 8 months ago
  Diego Devesa 6c8b91500e llama-bench : fix -ot with dl backends (#13563) 8 months ago
  Georgi Gerganov b2838049cc bench : handle decode errors (#13548) 8 months ago
  Diego Devesa cf0a43bb64 llama-bench : add defrag-thold, check for invalid ranges (#13487) 8 months ago
  Diego Devesa 22cdab343b llama-bench : accept ranges for integer parameters (#13410) 8 months ago
  David Huang 7f323a589f Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386) 8 months ago
  Diego Devesa 1d36b3670b llama : move end-user examples to tools directory (#13249) 8 months ago