Georgi Gerganov
|
16bcc1259d
kv-cache : pad the cache size to 256 for performance (#17046)
|
2 months ago |
Johannes Gäßler
|
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
|
4 months ago |
Georgi Gerganov
|
d2fcd91cf9
server : disable context shift by default (#15416)
|
5 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |