Author | Commit | Message | Date
Xuan Son Nguyen | fc54ef0d1c | server : support reading arguments from environment variables (#9105) | 1 year ago
Liu Jia | fb487bb567 | common : add support for cpu_get_num_physical_cores() on Windows (#8771) | 1 year ago
Zhenwei Jin | 4af8420afb | common : remove duplicate function llama_should_add_bos_token (#8778) | 1 year ago
fairydreaming | 7c3f55c100 | Add support for encoder-only T5 models (#8900) | 1 year ago
Georgi Gerganov | 45a55b91aa | llama : better replace_all (cont) (#8926) | 1 year ago
Xuan Son Nguyen | 1e6f6554aa | server : add lora hotswap endpoint (WIP) (#8857) | 1 year ago
Liu Jia | 0a4ce78681 | common : Changed tuple to struct (TODO fix) (#8823) | 1 year ago
Igor Okulist | afbbcf3c04 | server : update llama-server embedding flag documentation (#8779) | 1 year ago
Daniel Bevenius | 9d03d085dd | common : add --no-warmup option for main/llama-cli (#8712) | 1 year ago
Xuan Son Nguyen | 96952e7181 | llama : fix `llama_chat_format_single` for mistral (#8657) | 1 year ago
Xuan Son Nguyen | de280085e7 | examples : Fix `llama-export-lora` example (#8607) | 1 year ago
Xuan Son Nguyen | 97bdd26eee | Refactor lora adapter support (#8332) | 1 year ago
Georgi Gerganov | 9104bc20ed | common : add --no-cont-batching arg (#6358) | 1 year ago
Borislav Stanimirov | 7a80710d93 | msvc : silence codecvt c++17 deprecation warnings (#8395) | 1 year ago
Derrick T. Woolworth | 86e7299ef5 | added support for Authorization Bearer tokens when downloading model (#8307) | 1 year ago
jaime-m-p | 213701b51a | Detokenizer fixes (#8039) | 1 year ago
Douglas Hanley | d12f781074 | llama : streamline embeddings from "non-embedding" models (#8087) | 1 year ago
Xuan Son Nguyen | a38b884c6c | cli: add EOT when user hit Ctrl+C (#8296) | 1 year ago
fairydreaming | 807b0c49ff | Inference support for T5 and FLAN-T5 model families (#5763) | 1 year ago
MistApproach | a27152b602 | fix: add missing short command line argument -mli for multiline-input (#8261) | 1 year ago
Xuan Son Nguyen | 9ef0780062 | Fix new line issue with chat template, disable template when in-prefix/suffix is set (#8203) | 1 year ago
Sigbjørn Skjæret | 38373cfbab | Add SPM infill support (#8016) | 1 year ago
Xuan Son Nguyen | 16791b8f0b | Add chatml fallback for cpp `llama_chat_apply_template` (#8160) | 1 year ago
jukofyork | 97877eb10b | Control vector loading fixes (#8137) | 1 year ago
Xuan Son Nguyen | 49c03c79cd | cvector: better prompt handling, add "mean vector" method (#8069) | 1 year ago
Xuan Son Nguyen | 48e6b92cc3 | Add chat template support for llama-cli (#8068) | 1 year ago
HatsuneMikuUwU33 | f702a90e24 | Update control vector help (#8104) | 1 year ago
Yann Follet | 646ef4a9cf | embedding : more cli arguments (#7458) | 1 year ago
Xuan Son Nguyen | 3e58b0ee35 | cvector: fix CI + correct help message (#8064) | 1 year ago
Douglas Hanley | 80ea089d77 | llama : allow pooled embeddings on any model (#7477) | 1 year ago