Georgi Gerganov
|
9fbda719de
clip : refactor + bug fixes (#4696)
|
2 years ago |
Justine Tunney
|
db49ff8ed7
server : replace sleep with condition variables (#4673)
|
2 years ago |
SakuraUmi
|
60f55e888c
server : fix OpenAI server sampling w.r.t. penalty. (#4675)
|
2 years ago |
Karthik Sethuraman
|
b93edd22f5
server : allow to generate multimodal embeddings (#4681)
|
2 years ago |
Justine Tunney
|
65e5f6dadb
Fix OpenAI server sampling w.r.t. temp and seed (#4668)
|
2 years ago |
Alexey Parfenov
|
6123979952
server : allow to specify custom prompt for penalty calculation (#3727)
|
2 years ago |
olexiyb
|
0ffc92d2d2
server : disable llm logs if SERVER_VERBOSE is off (#3792)
|
2 years ago |
AdithyanI
|
8edd2b40fd
server : fix grammar being ignored (#4494)
|
2 years ago |
Alexey Parfenov
|
eb16dae7e7
server : fix possible ambiguity in content type charset (#4501)
|
2 years ago |
mzcu
|
62bd52b7bf
server : allow requests larger than 8K (#4500)
|
2 years ago |
ShadovvBeast
|
88ae8952b6
server : add optional API Key Authentication example (#4441)
|
2 years ago |
shibe2
|
948ff137ec
server : fix handling of characters that span multiple tokens when streaming (#4446)
|
2 years ago |
Vladimir Zorin
|
d9d4cfef64
server : fix local model name in server (#4420)
|
2 years ago |
Georgi Gerganov
|
bcc0eb4591
llama : per-layer KV cache + quantum K cache (#4309)
|
2 years ago |
Georgi Gerganov
|
05cd6e5036
server : recognize cache_prompt parameter in OAI API (#4347)
|
2 years ago |
Ed Lee
|
33e171d1e9
server : fix OpenAI API `stop` field to be optional (#4299)
|
2 years ago |
Georgi Gerganov
|
d5a1cbde60
llama : support optional tensors (#4283)
|
2 years ago |
Ziad Ben Hadj-Alouane
|
1d144112c0
server : add --log-disable to disable logging to file (#4260)
|
2 years ago |
Ziad Ben Hadj-Alouane
|
f43f09366d
server : add single-client multi-prompt support (#4232)
|
2 years ago |
Georgi Gerganov
|
af19d35734
server : OAI API compatibility (#4198)
|
2 years ago |
Haohui Mai
|
55978ce09b
Fix incorrect format strings and uninitialized variables. (#4133)
|
2 years ago |
SoftwareRenderer
|
936c79b227
server : relay error messages (#4131)
|
2 years ago |
Kerfuffle
|
91f6499393
Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)
|
2 years ago |
Alexey Parfenov
|
d96ca7ded7
server : fix crash when prompt exceeds context size (#3996)
|
2 years ago |
Mihai
|
57ad015dc3
server : add min_p param (#3877)
|
2 years ago |
cebtenzzre
|
b12fa0d1c1
build : link against build info instead of compiling against it (#3879)
|
2 years ago |
cebtenzzre
|
898aeca90a
llama : implement YaRN RoPE scaling (#2268)
|
2 years ago |
Adrian Hesketh
|
ca190bca8e
server : re-enable completion and embedded at the same time (#3876)
|
2 years ago |
Kerfuffle
|
6e08281e58
Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843)
|
2 years ago |
Georgi Gerganov
|
34b2a5e1ee
server : do not release slot on image input (#3798)
|
2 years ago |