Author | Commit | Message | Date
Faisal Zaghloul | 42c76d1358 | Threadpool: take 2 (#8672) | 1 year ago
Xuan Son Nguyen | a77feb5d71 | server : add some missing env variables (#9116) | 1 year ago
Justine Tunney | 436787f170 | llama : fix time complexity of string replacement (#9163) | 1 year ago
Herman Semenov | 93bc3839f9 | common: fixed not working find argument --n-gpu-layers-draft (#9175) | 1 year ago
Xuan Son Nguyen | fc54ef0d1c | server : support reading arguments from environment variables (#9105) | 1 year ago
Liu Jia | fb487bb567 | common : add support for cpu_get_num_physical_cores() on Windows (#8771) | 1 year ago
Zhenwei Jin | 4af8420afb | common : remove duplicate function llama_should_add_bos_token (#8778) | 1 year ago
fairydreaming | 7c3f55c100 | Add support for encoder-only T5 models (#8900) | 1 year ago
Georgi Gerganov | 45a55b91aa | llama : better replace_all (cont) (#8926) | 1 year ago
Xuan Son Nguyen | 1e6f6554aa | server : add lora hotswap endpoint (WIP) (#8857) | 1 year ago
Liu Jia | 0a4ce78681 | common : Changed tuple to struct (TODO fix) (#8823) | 1 year ago
Igor Okulist | afbbcf3c04 | server : update llama-server embedding flag documentation (#8779) | 1 year ago
Daniel Bevenius | 9d03d085dd | common : add --no-warmup option for main/llama-cli (#8712) | 1 year ago
Xuan Son Nguyen | 96952e7181 | llama : fix `llama_chat_format_single` for mistral (#8657) | 1 year ago
Xuan Son Nguyen | de280085e7 | examples : Fix `llama-export-lora` example (#8607) | 1 year ago
Xuan Son Nguyen | 97bdd26eee | Refactor lora adapter support (#8332) | 1 year ago
Georgi Gerganov | 9104bc20ed | common : add --no-cont-batching arg (#6358) | 1 year ago
Borislav Stanimirov | 7a80710d93 | msvc : silence codecvt c++17 deprecation warnings (#8395) | 1 year ago
Derrick T. Woolworth | 86e7299ef5 | added support for Authorization Bearer tokens when downloading model (#8307) | 1 year ago
jaime-m-p | 213701b51a | Detokenizer fixes (#8039) | 1 year ago
Douglas Hanley | d12f781074 | llama : streamline embeddings from "non-embedding" models (#8087) | 1 year ago
Xuan Son Nguyen | a38b884c6c | cli: add EOT when user hit Ctrl+C (#8296) | 1 year ago
fairydreaming | 807b0c49ff | Inference support for T5 and FLAN-T5 model families (#5763) | 1 year ago
MistApproach | a27152b602 | fix: add missing short command line argument -mli for multiline-input (#8261) | 1 year ago
Xuan Son Nguyen | 9ef0780062 | Fix new line issue with chat template, disable template when in-prefix/suffix is set (#8203) | 1 year ago
Sigbjørn Skjæret | 38373cfbab | Add SPM infill support (#8016) | 1 year ago
Xuan Son Nguyen | 16791b8f0b | Add chatml fallback for cpp `llama_chat_apply_template` (#8160) | 1 year ago
jukofyork | 97877eb10b | Control vector loading fixes (#8137) | 1 year ago
Xuan Son Nguyen | 49c03c79cd | cvector: better prompt handling, add "mean vector" method (#8069) | 1 year ago
Xuan Son Nguyen | 48e6b92cc3 | Add chat template support for llama-cli (#8068) | 1 year ago