Georgi Gerganov
|
d00cbea63c
server : host-memory prompt caching (#16391)
|
3 months ago |
65a
|
4afb0a746f
server : Support multimodal completion and embeddings prompts in JSON format (#15108)
|
4 months ago |
Georgi Gerganov
|
d2fcd91cf9
server : disable context shift by default (#15416)
|
5 months ago |
Lukas Straub
|
a9f77a8be3
server : add openai-style logit_bias support (#14946)
|
5 months ago |
Olivier Chafik
|
f13847cfb5
server: fix regression on streamed non-chat completion w/ stops (#13785)
|
7 months ago |
Xuan-Son Nguyen
|
360a9c98e1
server : fix cache_tokens bug with no cache_prompt (#13533)
|
8 months ago |
Diego Devesa
|
1d36b3670b
llama : move end-user examples to tools directory (#13249)
|
8 months ago |