Georgi Gerganov
|
6262d13e0b
common : reimplement logging (#9418)
|
1 year ago |
Georgi Gerganov
|
0abc6a2c25
llama : llama_perf + option to disable timings during decode (#9355)
|
1 year ago |
Xuan Son Nguyen
|
bfe76d4a17
common : move arg parser code to `arg.cpp` (#9388)
|
1 year ago |
Xuan Son Nguyen
|
1b9ae5189c
common : refactor arg parser (#9308)
|
1 year ago |
Georgi Gerganov
|
df270ef745
llama : refactor sampling v2 (#9294)
|
1 year ago |
Faisal Zaghloul
|
42c76d1358
Threadpool: take 2 (#8672)
|
1 year ago |
Liu Jia
|
0a4ce78681
common : Changed tuple to struct (TODO fix) (#8823)
|
1 year ago |
Georgi Gerganov
|
1442677f92
common : refactor cli arg parsing (#7675)
|
1 year ago |
Pedro Cuenca
|
b97bc3966e
llama : support Llama 3 HF conversion (#6745)
|
1 year ago |
Jared Van Bortel
|
1b67731e18
BERT tokenizer fixes (#6498)
|
1 year ago |
compilade
|
557410b8f0
llama : greatly reduce output buffer memory usage (#6122)
|
1 year ago |
Minsoo Cheong
|
586e7bc561
sampling : deduplicated code for probability distribution access (#6240)
|
1 year ago |
Jeffrey Quesnelle
|
29eee40474
fix speculative decoding build on windows (#5874)
|
1 year ago |
Minsoo Cheong
|
6d341ab6c5
speculative : implement stochastic speculative sampling (#5625)
|
1 year ago |
bmwl
|
f486f6e1e5
ggml : add numa options (#5377)
|
1 year ago |
stduhpf
|
e0324285a5
speculative : threading options (#4959)
|
2 years ago |
Richard Kiss
|
9494d7c477
english : use `typos` to fix comments and logs (#4354)
|
2 years ago |
stduhpf
|
da5eaef1f3
speculative : support `--color` (#4343)
|
2 years ago |
Branden Butler
|
40a34fe8d0
speculative : fix prompt tokenization in speculative example (#4025)
|
2 years ago |
Georgi Gerganov
|
8f961abdc4
speculative : change default p_accept to 0.5 + CLI args (#3919)
|
2 years ago |
cebtenzzre
|
b12fa0d1c1
build : link against build info instead of compiling against it (#3879)
|
2 years ago |
Georgi Gerganov
|
ee1a0ec9cb
llama : add option for greedy sampling with probs (#3813)
|
2 years ago |
Kerfuffle
|
41aee4df82
speculative : ensure draft and target model vocab matches (#3812)
|
2 years ago |
Marcus Dunn
|
5be6c803fa
llama : remove token functions with `context` args in favor of `model` (#3720)
|
2 years ago |
Georgi Gerganov
|
d1031cf49c
sampling : refactor init to use llama_sampling_params (#3696)
|
2 years ago |
Georgi Gerganov
|
4e82b2ea3f
speculative : bug fixes
|
2 years ago |
Georgi Gerganov
|
0e89203b51
speculative : add tree-based sampling example (#3624)
|
2 years ago |
Kerfuffle
|
70c29da118
common : fix mirostat state when using multiple sequences (#3543)
|
2 years ago |
Georgi Gerganov
|
ac2219fef3
llama : fix session saving/loading (#3400)
|
2 years ago |
slaren
|
16bc66d947
llama.cpp : split llama_context_params into model and context params (#3301)
|
2 years ago |