Johannes Gäßler | 890719311b | common: fix warning message when no GPU found (#10564) | 1 year ago
Xuan Son Nguyen | 9f912511bc | common : fix duplicated file name with hf_repo and hf_file (#10550) | 1 year ago
Diego Devesa | 10bce0450f | llama : accept a list of devices to use to offload a model (#10497) | 1 year ago
Georgi Gerganov | d9d54e498d | speculative : refactor and add a simpler example (#10362) | 1 year ago
Johannes Gäßler | 4e54be0ec6 | llama/ex: remove --logdir argument (#10339) | 1 year ago
Georgi Gerganov | 8d8ff71536 | llama : remove Tail-Free sampling (#10071) | 1 year ago
wwoodsTM | ff252ea48e | llama : add DRY sampler (#9702) | 1 year ago
Michael Podvitskiy | d80fb71f8b | llama: string_split fix (#10022) | 1 year ago
Daniel Bevenius | 674804a996 | arg : fix typo in embeddings argument help [no ci] (#9994) | 1 year ago
Daniel Bevenius | 94008cc760 | arg : fix attention non-causal arg value hint (#9985) | 1 year ago
MaggotHATE | fbc98b748e | sampling : add XTC sampler (#9742) | 1 year ago
Georgi Gerganov | c7181bd294 | server : reuse cached context chunks (#9866) | 1 year ago
Georgi Gerganov | 1bde94dd02 | server : remove self-extend features (#9860) | 1 year ago
Georgi Gerganov | 95c76e8e92 | server : remove legacy system_prompt feature (#9857) | 1 year ago
Georgi Gerganov | 11ac9800af | llama : improve infill support and special token detection (#9798) | 1 year ago
Diego Devesa | 7eee341bee | common : use common_ prefix for common library functions (#9805) | 1 year ago
Diego Devesa | 0e9f760eb1 | rpc : add backend registry / device interfaces (#9812) | 1 year ago
Xuan Son Nguyen | 458367a906 | server : better security control for public deployments (#9776) | 1 year ago
Daniel Kleine | 133c7b46b3 | Fixed RNG seed docs (#9723) | 1 year ago
Georgi Gerganov | f4d2b8846a | llama : add reranking support (#9510) | 1 year ago
Xuan Son Nguyen | afbbfaa537 | server : add more env vars, improve gen-docs (#9635) | 1 year ago
Xuan Son Nguyen | 0b3bf966f4 | server : add --no-context-shift option (#9607) | 1 year ago
Bert Wagner | 8b836ae731 | arg : add env variable for parallel (#9513) | 1 year ago
Vinesh Janarthanan | 441b72b91f | main : option to disable context shift (#9484) | 1 year ago
Georgi Gerganov | 6262d13e0b | common : reimplement logging (#9418) | 1 year ago
Georgi Gerganov | 0abc6a2c25 | llama : llama_perf + option to disable timings during decode (#9355) | 1 year ago
Xuan Son Nguyen | 6cd4e03444 | arg : bring back missing ifdef (#9411) | 1 year ago
matteo | 8d300bd35f | enable --special arg for llama-server (#9419) | 1 year ago
slaren | 49006c67b4 | llama : move random seed generation to the samplers (#9398) | 1 year ago
Xuan Son Nguyen | bfe76d4a17 | common : move arg parser code to `arg.cpp` (#9388) | 1 year ago