Xuan-Son Nguyen
|
3c3635d2f2
server : speed up tests (#15836)
|
4 months ago |
Johannes Gäßler
|
e81b8e4b7f
llama: use FA + max. GPU layers by default (#15434)
|
4 months ago |
Olivier Chafik
|
f5cd27b71d
`server`: streaming of tool calls and thoughts when `--jinja` is on (#12379)
|
7 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |