Jeffrey Quesnelle
|
29eee40474
fix speculative decoding build on windows (#5874)
|
1 year ago |
Minsoo Cheong
|
6d341ab6c5
speculative : implement stochastic speculative sampling (#5625)
|
1 year ago |
bmwl
|
f486f6e1e5
ggml : add numa options (#5377)
|
1 year ago |
stduhpf
|
e0324285a5
speculative : threading options (#4959)
|
2 years ago |
Richard Kiss
|
9494d7c477
english : use `typos` to fix comments and logs (#4354)
|
2 years ago |
stduhpf
|
da5eaef1f3
speculative : support `--color` (#4343)
|
2 years ago |
Branden Butler
|
40a34fe8d0
speculative : fix prompt tokenization in speculative example (#4025)
|
2 years ago |
Georgi Gerganov
|
8f961abdc4
speculative : change default p_accept to 0.5 + CLI args (#3919)
|
2 years ago |
cebtenzzre
|
b12fa0d1c1
build : link against build info instead of compiling against it (#3879)
|
2 years ago |
Georgi Gerganov
|
ee1a0ec9cb
llama : add option for greedy sampling with probs (#3813)
|
2 years ago |
Kerfuffle
|
41aee4df82
speculative : ensure draft and target model vocab matches (#3812)
|
2 years ago |
Marcus Dunn
|
5be6c803fa
llama : remove token functions with `context` args in favor of `model` (#3720)
|
2 years ago |
Georgi Gerganov
|
d1031cf49c
sampling : refactor init to use llama_sampling_params (#3696)
|
2 years ago |
Georgi Gerganov
|
4e82b2ea3f
speculative : bug fixes
|
2 years ago |
Georgi Gerganov
|
0e89203b51
speculative : add tree-based sampling example (#3624)
|
2 years ago |
Kerfuffle
|
70c29da118
common : fix mirostat state when using multiple sequences (#3543)
|
2 years ago |
Georgi Gerganov
|
ac2219fef3
llama : fix session saving/loading (#3400)
|
2 years ago |
slaren
|
16bc66d947
llama.cpp : split llama_context_params into model and context params (#3301)
|
2 years ago |
Georgi Gerganov
|
ec893798b7
llama : custom attention mask + parallel decoding + no context swaps (#3228)
|
2 years ago |
Leng Yue
|
35f73049af
speculative : add heuristic algorithm (#3006)
|
2 years ago |
FK
|
84e723653c
speculative: add --n-gpu-layers-draft option (#3063)
|
2 years ago |
Przemysław Pawełczyk
|
cb6c44c5e0
build : do not use _GNU_SOURCE gratuitously (#2035)
|
2 years ago |
Georgi Gerganov
|
921772104b
speculative : add grammar support (#2991)
|
2 years ago |
Georgi Gerganov
|
47068e5170
speculative : PoC for speeding-up inference via speculative sampling (#2926)
|
2 years ago |