Georgi Gerganov | 55e47786e3 | llama : default sampling changes + greedy update (#9897) | 1 year ago
Xuan Son Nguyen | cda0e4b648 | llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745) | 1 year ago
Georgi Gerganov | 99bd4ac28c | llama : infill sampling handle very long tokens (#9924) | 1 year ago
Georgi Gerganov | 755a9b2bf0 | llama : add infill sampler (#9896) | 1 year ago
MaggotHATE | fbc98b748e | sampling : add XTC sampler (#9742) | 1 year ago
Georgi Gerganov | 11ac9800af | llama : improve infill support and special token detection (#9798) | 1 year ago
Diego Devesa | 0e9f760eb1 | rpc : add backend registry / device interfaces (#9812) | 1 year ago
Georgi Gerganov | f4d2b8846a | llama : add reranking support (#9510) | 1 year ago
Georgi Gerganov | 739842703e | llama : add comment about thread-safety [no ci] (#9449) | 1 year ago
nopperl | 9a913110cf | llama : add support for Chameleon (#8543) | 1 year ago
Georgi Gerganov | b0f27361f3 | sampling : avoid expensive softmax during greedy sampling (#9605) | 1 year ago
Michael Podvitskiy | 37f3a3810e | llama : add llama_n_head() (#9512) | 1 year ago
Georgi Gerganov | 0abc6a2c25 | llama : llama_perf + option to disable timings during decode (#9355) | 1 year ago
Gilad S. | bd35cb0ae3 | feat: remove a sampler from a chain (#9445) | 1 year ago
slaren | 49006c67b4 | llama : move random seed generation to the samplers (#9398) | 1 year ago
slaren | 5fb5e24811 | llama : minor sampling refactor (2) (#9386) | 1 year ago
Georgi Gerganov | df270ef745 | llama : refactor sampling v2 (#9294) | 1 year ago
compilade | 9bc6db28d0 | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151) | 1 year ago
Molly Sophia | 8f1d81a0b6 | llama : support RWKV v6 models (#8980) | 1 year ago
Sutou Kouhei | 0ab30f8d82 | llama : fix llama_split_mode enum values in main_gpu document (#9057) | 1 year ago
Faisal Zaghloul | 42c76d1358 | Threadpool: take 2 (#8672) | 1 year ago
compilade | a1631e53f6 | llama : simplify Mamba with advanced batch splits (#8526) | 1 year ago
Minsoo Cheong | c679e0cb5c | llama : add EXAONE model support (#9025) | 1 year ago
Zhenwei Jin | 4af8420afb | common : remove duplicate function llama_should_add_bos_token (#8778) | 1 year ago
Esko Toivonen | 6bda7ce6c3 | llama : add pre-tokenizer regexes for BLOOM and gpt3-finnish (#8850) | 1 year ago
Daniel Bevenius | 06943a69f6 | ggml : move rope type enum to ggml.h (#8949) | 1 year ago
fairydreaming | 7c3f55c100 | Add support for encoder-only T5 models (#8900) | 1 year ago
Nexes the Old | 31958546c3 | typo correction (#8891) | 1 year ago
compilade | 4c676c85e5 | llama : refactor session file management (#8699) | 1 year ago
Xuan Son Nguyen | b115105f05 | add llama_lora_adapter_clear (#8653) | 1 year ago