Georgi Gerganov
|
f914544b16
batched-bench : add "separate text gen" mode (#17103)
|
2 months ago |
Georgi Gerganov
|
7fd205a8e8
scripts : add script to bench models (#16894)
|
2 months ago |
Georgi Gerganov
|
a885dcff11
batched-bench : fix llama_synchronize usage during prompt processing (#15835)
|
4 months ago |
Johannes Gäßler
|
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
|
4 months ago |
Georgi Gerganov
|
b3964c1e89
metal : optimize FA vec for large sequences and BS <= 8 (#15566)
|
4 months ago |
Georgi Gerganov
|
6b64f74b55
batched-bench : fix unified KV cache handling + pp timing (#15562)
|
4 months ago |
Georgi Gerganov
|
f0d3c7405c
batched-bench : use rand tokens (#15398)
|
5 months ago |
Georgi Gerganov
|
225e7a1438
llama : add high-throughput mode (#14363)
|
6 months ago |
Georgi Gerganov
|
745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
|
7 months ago |
Georgi Gerganov
|
b89d605a91
batched-bench : fix pp batch contents (#13492)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |