cturan/llama.cpp

mirror of https://github.com/cturan/llama.cpp

Author	SHA1 Message	Date
Georgi Gerganov	d00cbea63c server : host-memory prompt caching (#16391)	3 months ago
Johannes Gäßler	e81b8e4b7f llama: use FA + max. GPU layers by default (#15434)	4 months ago
Georgi Gerganov	d2fcd91cf9 server : disable context shift by default (#15416)	5 months ago
Xuan-Son Nguyen	6aa892ec2a server : do not return error out of context (with ctx shift disabled) (#13577)	8 months ago
Diego Devesa	1d36b3670b llama : move end-user examples to tools directory (#13249)	8 months ago