makomk | ee8243adaa | server : fix crash with multimodal models without BOS token (#4904) | 2 years ago
slaren | e7e4df031b | llama : ggml-backend integration (#4766) | 2 years ago
Georgi Gerganov | 1d118386fe | server : fix infill when prompt is empty (#4833) | 2 years ago
Laura | 4330bd83fe | server : implement credentialed CORS (#4514) | 2 years ago
Michael Coppola | 27379455c3 | server : support for multiple api keys (#4864) | 2 years ago
Behnam M | eab6795006 | server : add `LOG_INFO` when model is successfully loaded (#4881) | 2 years ago
Isaac McFadyen | 2f043328e3 | server : fix typo in model name (#4876) | 2 years ago
Georgi Gerganov | 5c1980d8d4 | server : fix build + rename enums (#4870) | 2 years ago
Behnam M | cd108e641d | server : add a `/health` endpoint (#4860) | 2 years ago
Georgi Gerganov | 67984921a7 | server : fix n_predict check (#4798) | 2 years ago
Georgi Gerganov | 012cf349ae | server : send token probs for "stream == false" (#4714) | 2 years ago
Georgi Gerganov | 32866c5edd | editorconfig : fix whitespace and indentation #4710 | 2 years ago
minarchist | 5d7002d437 | server : add --override-kv parameter (#4710) | 2 years ago
Georgi Gerganov | 9fbda719de | clip : refactor + bug fixes (#4696) | 2 years ago
Justine Tunney | db49ff8ed7 | server : replace sleep with condition variables (#4673) | 2 years ago
SakuraUmi | 60f55e888c | server : fix OpenAI server sampling w.r.t. penalty. (#4675) | 2 years ago
Karthik Sethuraman | b93edd22f5 | server : allow to generate multimodal embeddings (#4681) | 2 years ago
Justine Tunney | 65e5f6dadb | Fix OpenAI server sampling w.r.t. temp and seed (#4668) | 2 years ago
Alexey Parfenov | 6123979952 | server : allow to specify custom prompt for penalty calculation (#3727) | 2 years ago
olexiyb | 0ffc92d2d2 | server : disable llm logs if SERVER_VERBOSE is off (#3792) | 2 years ago
AdithyanI | 8edd2b40fd | server : fix grammar being ignored (#4494) | 2 years ago
Alexey Parfenov | eb16dae7e7 | server : fix possible ambiguity in content type charset (#4501) | 2 years ago
mzcu | 62bd52b7bf | server : allow requests larger than 8K (#4500) | 2 years ago
ShadovvBeast | 88ae8952b6 | server : add optional API Key Authentication example (#4441) | 2 years ago
shibe2 | 948ff137ec | server : fix handling of characters that span multiple tokens when streaming (#4446) | 2 years ago
Vladimir Zorin | d9d4cfef64 | server : fix local model name in server (#4420) | 2 years ago
Georgi Gerganov | bcc0eb4591 | llama : per-layer KV cache + quantum K cache (#4309) | 2 years ago
Georgi Gerganov | 05cd6e5036 | server : recognize cache_prompt parameter in OAI API (#4347) | 2 years ago
Ed Lee | 33e171d1e9 | server : fix OpenAI API `stop` field to be optional (#4299) | 2 years ago
Georgi Gerganov | d5a1cbde60 | llama : support optional tensors (#4283) | 2 years ago