Xuan Son Nguyen
|
9f912511bc
common : fix duplicated file name with hf_repo and hf_file (#10550)
|
1 year ago |
Diego Devesa
|
10bce0450f
llama : accept a list of devices to use to offload a model (#10497)
|
1 year ago |
Georgi Gerganov
|
d9d54e498d
speculative : refactor and add a simpler example (#10362)
|
1 year ago |
Johannes Gäßler
|
4e54be0ec6
llama/ex: remove --logdir argument (#10339)
|
1 year ago |
Georgi Gerganov
|
b141e5f6ef
server : enable KV cache defrag by default (#10233)
|
1 year ago |
Georgi Gerganov
|
1926d6e39d
llama : adjust default context size + print warnings (#10136)
|
1 year ago |
Georgi Gerganov
|
8d8ff71536
llama : remove Tail-Free sampling (#10071)
|
1 year ago |
wwoodsTM
|
ff252ea48e
llama : add DRY sampler (#9702)
|
1 year ago |
Michael Podvitskiy
|
d80fb71f8b
llama: string_split fix (#10022)
|
1 year ago |
Daniel Bevenius
|
674804a996
arg : fix typo in embeddings argument help [no ci] (#9994)
|
1 year ago |
Georgi Gerganov
|
755a9b2bf0
llama : add infill sampler (#9896)
|
1 year ago |
MaggotHATE
|
fbc98b748e
sampling : add XTC sampler (#9742)
|
1 year ago |
Georgi Gerganov
|
c7181bd294
server : reuse cached context chunks (#9866)
|
1 year ago |
Georgi Gerganov
|
95c76e8e92
server : remove legacy system_prompt feature (#9857)
|
1 year ago |
Georgi Gerganov
|
11ac9800af
llama : improve infill support and special token detection (#9798)
|
1 year ago |
Diego Devesa
|
7eee341bee
common : use common_ prefix for common library functions (#9805)
|
1 year ago |
Xuan Son Nguyen
|
458367a906
server : better security control for public deployments (#9776)
|
1 year ago |
Georgi Gerganov
|
f4d2b8846a
llama : add reranking support (#9510)
|
1 year ago |
Vinesh Janarthanan
|
441b72b91f
main : option to disable context shift (#9484)
|
1 year ago |
Georgi Gerganov
|
6262d13e0b
common : reimplement logging (#9418)
|
1 year ago |
Georgi Gerganov
|
0abc6a2c25
llama : llama_perf + option to disable timings during decode (#9355)
|
1 year ago |
Xuan Son Nguyen
|
bfe76d4a17
common : move arg parser code to `arg.cpp` (#9388)
|
1 year ago |
Xuan Son Nguyen
|
3f7ccfd649
common : bring back missing args, add env var duplication check (#9375)
|
1 year ago |
Xuan Son Nguyen
|
1b9ae5189c
common : refactor arg parser (#9308)
|
1 year ago |
Georgi Gerganov
|
df270ef745
llama : refactor sampling v2 (#9294)
|
1 year ago |
Aarni Koskela
|
815b1fb20a
batched-bench : add `--output-format jsonl` option (#9293)
|
1 year ago |
Faisal Zaghloul
|
42c76d1358
Threadpool: take 2 (#8672)
|
1 year ago |
Xuan Son Nguyen
|
fc54ef0d1c
server : support reading arguments from environment variables (#9105)
|
1 year ago |
Zhenwei Jin
|
4af8420afb
common : remove duplicate function llama_should_add_bos_token (#8778)
|
1 year ago |
Georgi Gerganov
|
45a55b91aa
llama : better replace_all (cont) (#8926)
|
1 year ago |