Diego Devesa
|
10bce0450f
llama : accept a list of devices to use to offload a model (#10497)
|
1 year ago |
Georgi Gerganov
|
8e752a777b
llama : add check for KV cache shifts (#10401)
|
1 year ago |
Johannes Gäßler
|
4e54be0ec6
llama/ex: remove --logdir argument (#10339)
|
1 year ago |
Michael Podvitskiy
|
fb4a0ec083
llama : propagate the results of `graph_compute` (#9525)
|
1 year ago |
Diego Devesa
|
9f40989351
ggml : move CPU backend to a separate file (#10144)
|
1 year ago |
Diego Devesa
|
c5b0f4b5d9
llama : refactor model loader with backend registry (#10026)
|
1 year ago |
Georgi Gerganov
|
8d8ff71536
llama : remove Tail-Free sampling (#10071)
|
1 year ago |
wwoodsTM
|
ff252ea48e
llama : add DRY sampler (#9702)
|
1 year ago |
Georgi Gerganov
|
55e47786e3
llama : default sampling changes + greedy update (#9897)
|
1 year ago |
Xuan Son Nguyen
|
cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
|
1 year ago |
Georgi Gerganov
|
99bd4ac28c
llama : infill sampling handle very long tokens (#9924)
|
1 year ago |
Georgi Gerganov
|
755a9b2bf0
llama : add infill sampler (#9896)
|
1 year ago |
MaggotHATE
|
fbc98b748e
sampling : add XTC sampler (#9742)
|
1 year ago |
Georgi Gerganov
|
11ac9800af
llama : improve infill support and special token detection (#9798)
|
1 year ago |
Diego Devesa
|
0e9f760eb1
rpc : add backend registry / device interfaces (#9812)
|
1 year ago |
Georgi Gerganov
|
f4d2b8846a
llama : add reranking support (#9510)
|
1 year ago |
Georgi Gerganov
|
739842703e
llama : add comment about thread-safety [no ci] (#9449)
|
1 year ago |
nopperl
|
9a913110cf
llama : add support for Chameleon (#8543)
|
1 year ago |
Georgi Gerganov
|
b0f27361f3
sampling : avoid expensive softmax during greedy sampling (#9605)
|
1 year ago |
Michael Podvitskiy
|
37f3a3810e
llama : add llama_n_head() (#9512)
|
1 year ago |
Georgi Gerganov
|
0abc6a2c25
llama : llama_perf + option to disable timings during decode (#9355)
|
1 year ago |
Gilad S.
|
bd35cb0ae3
feat: remove a sampler from a chain (#9445)
|
1 year ago |
slaren
|
49006c67b4
llama : move random seed generation to the samplers (#9398)
|
1 year ago |
slaren
|
5fb5e24811
llama : minor sampling refactor (2) (#9386)
|
1 year ago |
Georgi Gerganov
|
df270ef745
llama : refactor sampling v2 (#9294)
|
1 year ago |
compilade
|
9bc6db28d0
ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151)
|
1 year ago |
Molly Sophia
|
8f1d81a0b6
llama : support RWKV v6 models (#8980)
|
1 year ago |
Sutou Kouhei
|
0ab30f8d82
llama : fix llama_split_mode enum values in main_gpu document (#9057)
|
1 year ago |
Faisal Zaghloul
|
42c76d1358
Threadpool: take 2 (#8672)
|
1 year ago |
compilade
|
a1631e53f6
llama : simplify Mamba with advanced batch splits (#8526)
|
1 year ago |