Kevin Wang
|
ffd00797d8
common : avoid unnecessary logits fetch (#8358)
|
1 year ago |
Daniel Bevenius
|
e6bf007744
llama : return nullptr from llama_grammar_init (#8093)
|
1 year ago |
Georgi Gerganov
|
6ff13987ad
common : normalize naming style (#7462)
|
1 year ago |
Olivier Chafik
|
e402de364b
`grammars`: fix resampling logic regression (#7424)
|
1 year ago |
Johannes Gäßler
|
5ae3426b0b
server: fix reported top tokens for temperature 0 (#7203)
|
1 year ago |
Johannes Gäßler
|
af0a5b6163
server: fix incorrectly reported token probabilities (#7125)
|
1 year ago |
David Renshaw
|
3f167476b1
sampling : use std::random_device{}() for default random seed (#6962)
|
1 year ago |
Johannes Gäßler
|
28103f4832
Server: fix seed for multiple slots (#6835)
|
1 year ago |
Minsoo Cheong
|
586e7bc561
sampling : deduplicated code for probability distribution access (#6240)
|
1 year ago |
Clint Herron
|
463628372d
grammar : handle missing "root" node (#6004)
|
1 year ago |
Minsoo Cheong
|
6d341ab6c5
speculative : implement stochastic speculative sampling (#5625)
|
1 year ago |
Pierrick Hymbert
|
e3965cf35a
server: tests - slow inference causes timeout on the CI (#5715)
|
1 year ago |
Robey Holderith
|
5ee99c32f5
common, server : surface min_keep as its own parameter (#5567)
|
1 year ago |
Georgi Gerganov
|
689a091bbe
sampling : do not set min_keep to n_probs (#5564)
|
1 year ago |
Alexey Parfenov
|
6dcc02d244
server : add "samplers" param to control the samplers order (#5494)
|
1 year ago |
Alexey Parfenov
|
a803333a4e
common : use enums for sampler types (#5418)
|
1 year ago |
Georgi Gerganov
|
139b62a839
common : fix compile warning
|
1 year ago |
Johannes Gäßler
|
26d4efd11e
sampling: fix top_k <= 0 (#5388)
|
1 year ago |
Michael Klimenko
|
35a2ee9143
Remove unused data and add fixes (#5154)
|
2 years ago |
l3utterfly
|
5eaf9964fc
llama : dynamic temperature sampling (#4972)
|
2 years ago |
David Friehs
|
4483396751
llama : apply classifier-free guidance to logits directly (#4951)
|
2 years ago |
Alexey Parfenov
|
6123979952
server : allow to specify custom prompt for penalty calculation (#3727)
|
2 years ago |
kalomaze
|
b9ec82d262
grammar : check the full vocab only if necessary (opt) (#4306)
|
2 years ago |
Georgi Gerganov
|
caa9249217
common : fix compile warning
|
2 years ago |
MaggotHATE
|
52c8bc3cf3
sampling : custom samplers order (#4285)
|
2 years ago |
l3utterfly
|
e75dfdd31b
sampling : null grammar field after reset (#3885)
|
2 years ago |
kalomaze
|
238657db23
samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841)
|
2 years ago |
Georgi Gerganov
|
ee1a0ec9cb
llama : add option for greedy sampling with probs (#3813)
|
2 years ago |
Marcus Dunn
|
5be6c803fa
llama : remove token functions with `context` args in favor of `model` (#3720)
|
2 years ago |
Georgi Gerganov
|
d1031cf49c
sampling : refactor init to use llama_sampling_params (#3696)
|
2 years ago |