Georgi Gerganov
|
7956bb4d7f
bench : cache the llama_context state at computed depth (#16944)
|
2 months ago |
Gadflyii
|
3df2244df4
llama : add --no-host to disable host buffers (#16310)
|
3 months ago |
Radoslav Gerganov
|
898acba681
rpc : add support for multiple devices (#16276)
|
3 months ago |
ssweens
|
be79d9fdd9
llama-bench: add --devices and --list-devices support (#16039)
|
3 months ago |
jacekpoplawski
|
8ff206097c
llama-bench: add --n-cpu-moe support (#15952)
|
4 months ago |
Diego Devesa
|
360d6533db
ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device type (#15797)
|
4 months ago |
Johannes Gäßler
|
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
|
4 months ago |
Georgi Gerganov
|
9ebebef62f
llama : remove KV cache defragmentation logic (#15473)
|
4 months ago |
Juk Armstrong
|
476aa3fd57
Fixed name `-override-tensors` to `-override-tensor` (#15129)
|
5 months ago |
R0CKSTAR
|
3025b621d1
llama-bench: rename DB table name from test to llama_bench (#15003)
|
5 months ago |
Radoslav Gerganov
|
c556418b60
llama-bench : use local GPUs along with RPC servers (#14917)
|
5 months ago |
bashayer hijji
|
fffcce535e
llama-bench : add --no-warmup flag (#14224) (#14270)
|
7 months ago |
Georgi Gerganov
|
745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
|
7 months ago |
Max Krasnyansky
|
053b1539c0
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (#12995)
|
7 months ago |
Georgi Gerganov
|
e298d2fbd0
kv-cache : add SWA support (#13194)
|
8 months ago |
Diego Devesa
|
6c8b91500e
llama-bench : fix -ot with dl backends (#13563)
|
8 months ago |
Georgi Gerganov
|
b2838049cc
bench : handle decode errors (#13548)
|
8 months ago |
Diego Devesa
|
cf0a43bb64
llama-bench : add defrag-thold, check for invalid ranges (#13487)
|
8 months ago |
Diego Devesa
|
22cdab343b
llama-bench : accept ranges for integer parameters (#13410)
|
8 months ago |
David Huang
|
7f323a589f
Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (#13386)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |