Georgi Gerganov
|
4301e27319
common : restore grammar-based rejection sampling (#18137)
|
1 month ago |
Georgi Gerganov
|
254098a279
common : refactor common_sampler + grammar logic changes (#17937)
|
1 month ago |
Georgi Gerganov
|
e92d53b29e
sampling : optimize samplers by reusing bucket sort (#15665)
|
4 months ago |
Copilot
|
d8914fc47e
common : add --override-tensor-draft, --cpu-moe-draft and --n-cpu-moe-draft parameters (#15191)
|
5 months ago |
Georgi Gerganov
|
745aa5319b
llama : deprecate llama_kv_self_ API (#14030)
|
7 months ago |
Xuan-Son Nguyen
|
267c1399f1
common : refactor downloading system, handle mmproj with -hf option (#12694)
|
9 months ago |
Georgi Gerganov
|
c6af2161b2
speculative : fix seg fault in certain cases (#12454)
|
10 months ago |
Georgi Gerganov
|
e0dbec0bc6
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
|
10 months ago |
Georgi Gerganov
|
afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming (#11110)
|
1 year ago |
Georgi Gerganov
|
f66f582927
llama : refactor `src/llama.cpp` (#10902)
|
1 year ago |
Diego Devesa
|
10bce0450f
llama : accept a list of devices to use to offload a model (#10497)
|
1 year ago |
Georgi Gerganov
|
d9d54e498d
speculative : refactor and add a simpler example (#10362)
|
1 year ago |
Georgi Gerganov
|
2a82891a85
speculative : fix out-of-bounds access (#10289)
|
1 year ago |
Georgi Gerganov
|
55e47786e3
llama : default sampling changes + greedy update (#9897)
|
1 year ago |
Georgi Gerganov
|
bc21975084
speculative : fix handling of some input params (#9963)
|
1 year ago |
Xuan Son Nguyen
|
cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
|
1 year ago |
Diego Devesa
|
7eee341bee
common : use common_ prefix for common library functions (#9805)
|
1 year ago |
Georgi Gerganov
|
b0f27361f3
sampling : avoid expensive softmax during greedy sampling (#9605)
|
1 year ago |
Georgi Gerganov
|
6262d13e0b
common : reimplement logging (#9418)
|
1 year ago |
Georgi Gerganov
|
0abc6a2c25
llama : llama_perf + option to disable timings during decode (#9355)
|
1 year ago |
Xuan Son Nguyen
|
bfe76d4a17
common : move arg parser code to `arg.cpp` (#9388)
|
1 year ago |
Xuan Son Nguyen
|
1b9ae5189c
common : refactor arg parser (#9308)
|
1 year ago |
Georgi Gerganov
|
df270ef745
llama : refactor sampling v2 (#9294)
|
1 year ago |
Faisal Zaghloul
|
42c76d1358
Threadpool: take 2 (#8672)
|
1 year ago |
Liu Jia
|
0a4ce78681
common : Changed tuple to struct (TODO fix) (#8823)
|
1 year ago |
Georgi Gerganov
|
1442677f92
common : refactor cli arg parsing (#7675)
|
1 year ago |
Pedro Cuenca
|
b97bc3966e
llama : support Llama 3 HF conversion (#6745)
|
1 year ago |
Jared Van Bortel
|
1b67731e18
BERT tokenizer fixes (#6498)
|
1 year ago |
compilade
|
557410b8f0
llama : greatly reduce output buffer memory usage (#6122)
|
1 year ago |
Minsoo Cheong
|
586e7bc561
sampling : deduplicated code for probability distribution access (#6240)
|
1 year ago |