Pierrick Hymbert
|
fd72d2d2a5
server: tests: add truncated prompt tests, better kv cache size (#5933)
|
1 year ago |
Pierrick Hymbert
|
76e868821a
server: metrics: add llamacpp:prompt_seconds_total and llamacpp:tokens_predicted_seconds_total, reset bucket only on /metrics. Fix values cast to int. Add Process-Start-Time-Unix header. (#5937)
|
1 year ago |
Georgi Gerganov
|
2002bc96bf
server : refactor (#5882)
|
1 year ago |
Pierrick Hymbert
|
9731134296
server: tests: passkey challenge / self-extend with context shift demo (#5832)
|
1 year ago |
Pierrick Hymbert
|
930b178026
server: logs - unified format and --log-format option (#5700)
|
1 year ago |
Pierrick Hymbert
|
d52d7819b8
server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708)
|
1 year ago |
Pierrick Hymbert
|
9e359a4f47
server: continue to update other slots on embedding concurrent request (#5699)
|
1 year ago |
Pierrick Hymbert
|
525213d2f5
server: init functional tests (#5566)
|
1 year ago |