cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	f914544b16 batched-bench : add "separate text gen" mode (#17103)	2 months ago
Georgi Gerganov	7fd205a8e8 scripts : add script to bench models (#16894)	2 months ago
Georgi Gerganov	a885dcff11 batched-bench : fix llama_synchronize usage during prompt processing (#15835)	4 months ago
Johannes Gäßler	e81b8e4b7f llama: use FA + max. GPU layers by default (#15434)	4 months ago
Georgi Gerganov	b3964c1e89 metal : optimize FA vec for large sequences and BS <= 8 (#15566)	4 months ago
Georgi Gerganov	6b64f74b55 batched-bench : fix unified KV cache handling + pp timing (#15562)	4 months ago
Georgi Gerganov	f0d3c7405c batched-bench : use rand tokens (#15398)	5 months ago
Georgi Gerganov	225e7a1438 llama : add high-throughput mode (#14363)	6 months ago
Georgi Gerganov	745aa5319b llama : deprecate llama_kv_self_ API (#14030)	7 months ago
Georgi Gerganov	b89d605a91 batched-bench : fix pp batch contents (#13492)	8 months ago
Diego Devesa	1d36b3670b llama : move end-user examples to tools directory (#13249)	8 months ago