Xuan Son Nguyen
|
a71d81cf8c
server : revamp chat UI with vuejs and daisyui (#10175)
|
1 year ago |
Georgi Gerganov
|
b11f9ba9b8
server : remove hack for extra parallel slot (#10187)
|
1 year ago |
Xuan Son Nguyen
|
9e0ecfb697
server : clarify /slots endpoint, add is_processing (#10162)
|
1 year ago |
sasha0552
|
42cadc74bd
server : fix slot selection by lru (#10126)
|
1 year ago |
Georgi Gerganov
|
45950415ed
server : fix endpoint checks (#10135)
|
1 year ago |
sasha0552
|
d865d1478c
server : fix smart selection of available slot (#10120)
|
1 year ago |
Kevin Gibbons
|
0a683e8088
server : include scheme when printing URL (#10106)
|
1 year ago |
Georgi Gerganov
|
8d8ff71536
llama : remove Tail-Free sampling (#10071)
|
1 year ago |
Georgi Gerganov
|
8125e6cbfc
server : don't overfill the batch during infill (#10018)
|
1 year ago |
wwoodsTM
|
ff252ea48e
llama : add DRY sampler (#9702)
|
1 year ago |
Michael Podvitskiy
|
d80fb71f8b
llama: string_split fix (#10022)
|
1 year ago |
Georgi Gerganov
|
bc5ba007b2
server : check that the prompt fits in the slot's context (#10030)
|
1 year ago |
Xuan Son Nguyen
|
958367bf53
server : refactor slot input data, move tokenizer to HTTP thread (#10023)
|
1 year ago |
wwoodsTM
|
0a1c750c80
server : samplers accept the prompt correctly (#10019)
|
1 year ago |
Xuan Son Nguyen
|
cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
|
1 year ago |
Georgi Gerganov
|
8901755ba3
server : add n_indent parameter for line indentation requirement (#9929)
|
1 year ago |
Alexey Parfenov
|
1f66b699c4
server : fix the disappearance of the end of the text (#9867)
|
1 year ago |
Georgi Gerganov
|
223c25a72f
server : improve infill context reuse (#9894)
|
1 year ago |
MaggotHATE
|
fbc98b748e
sampling : add XTC sampler (#9742)
|
1 year ago |
Georgi Gerganov
|
d4c19c0f5c
server : accept extra_context for the infill endpoint (#9874)
|
1 year ago |
Georgi Gerganov
|
c7181bd294
server : reuse cached context chunks (#9866)
|
1 year ago |
Georgi Gerganov
|
edc265661c
server : add option to time limit the generation phase (#9865)
|
1 year ago |
Georgi Gerganov
|
1bde94dd02
server : remove self-extend features (#9860)
|
1 year ago |
Georgi Gerganov
|
95c76e8e92
server : remove legacy system_prompt feature (#9857)
|
1 year ago |
Georgi Gerganov
|
11ac9800af
llama : improve infill support and special token detection (#9798)
|
1 year ago |
Diego Devesa
|
7eee341bee
common : use common_ prefix for common library functions (#9805)
|
1 year ago |
Xuan Son Nguyen
|
458367a906
server : better security control for public deployments (#9776)
|
1 year ago |
Georgi Gerganov
|
8c475b97b8
rerank : use [SEP] token instead of [BOS] (#9737)
|
1 year ago |
Georgi Gerganov
|
f4d2b8846a
llama : add reranking support (#9510)
|
1 year ago |
Xuan Son Nguyen
|
afbbfaa537
server : add more env vars, improve gen-docs (#9635)
|
1 year ago |