Georgi Gerganov
|
4e82b2ea3f
speculative : bug fixes
|
2 years ago |
Georgi Gerganov
|
0e89203b51
speculative : add tree-based sampling example (#3624)
|
2 years ago |
Kerfuffle
|
70c29da118
common : fix mirostat state when using multiple sequences (#3543)
|
2 years ago |
Georgi Gerganov
|
ac2219fef3
llama : fix session saving/loading (#3400)
|
2 years ago |
slaren
|
16bc66d947
llama.cpp : split llama_context_params into model and context params (#3301)
|
2 years ago |
Georgi Gerganov
|
ec893798b7
llama : custom attention mask + parallel decoding + no context swaps (#3228)
|
2 years ago |
Leng Yue
|
35f73049af
speculative : add heuristic algorithm (#3006)
|
2 years ago |
FK
|
84e723653c
speculative: add --n-gpu-layers-draft option (#3063)
|
2 years ago |
Przemysław Pawełczyk
|
cb6c44c5e0
build : do not use _GNU_SOURCE gratuitously (#2035)
|
2 years ago |
Georgi Gerganov
|
921772104b
speculative : add grammar support (#2991)
|
2 years ago |
Georgi Gerganov
|
47068e5170
speculative : PoC for speeding-up inference via speculative sampling (#2926)
|
2 years ago |