Georgi Gerganov
|
4301e27319
common : restore grammar-based rejection sampling (#18137)
|
1 month ago |
Georgi Gerganov
|
254098a279
common : refactor common_sampler + grammar logic changes (#17937)
|
1 month ago |
Georgi Gerganov
|
196f5083ef
common : more accurate sampling timing (#17382)
|
1 month ago |
Johannes Gäßler
|
e789095502
llama: print memory breakdown on exit (#15860)
|
3 months ago |
Georgi Gerganov
|
e92d53b29e
sampling : optimize samplers by reusing bucket sort (#15665)
|
4 months ago |
Olivier Chafik
|
f5cd27b71d
`server`: streaming of tool calls and thoughts when `--jinja` is on (#12379)
|
7 months ago |
Ycros
|
39e73ae0d6
common : Add a warning when we can't match samplers from a string or char. (#13330)
|
8 months ago |
oobabooga
|
233461f812
sampling : Integrate Top-nσ into main sampling chain (and add it to the server) (#13264)
|
8 months ago |
Johannes Gäßler
|
dd373dd3bf
llama: fix error on bad grammar (#12628)
|
9 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |
mgroeber9110
|
5bbe6a9fe9
ggml : portability fixes for VS 2017 (#12150)
|
10 months ago |
Olivier Chafik
|
c7f460ab88
`server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content (Command 7RB & DeepSeek R1) unless `--reasoning-format none` (#11607)
|
11 months ago |
Vinesh Janarthanan
|
27e8a23300
sampling: add Top-nσ sampler (#11223)
|
11 months ago |
Michał Moskal
|
ff227703d6
sampling : support for llguidance grammars (#10224)
|
11 months ago |
Olivier Chafik
|
8b576b6c55
Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
|
11 months ago |
Georgi Gerganov
|
afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming (#11110)
|
1 year ago |
Georgi Gerganov
|
644fd71b44
sampling : refactor + optimize penalties sampler (#10803)
|
1 year ago |
Georgi Gerganov
|
d9d54e498d
speculative : refactor and add a simpler example (#10362)
|
1 year ago |
Georgi Gerganov
|
8d8ff71536
llama : remove Tail-Free sampling (#10071)
|
1 year ago |
wwoodsTM
|
ff252ea48e
llama : add DRY sampler (#9702)
|
1 year ago |
Georgi Gerganov
|
55e47786e3
llama : default sampling changes + greedy update (#9897)
|
1 year ago |
Georgi Gerganov
|
755a9b2bf0
llama : add infill sampler (#9896)
|
1 year ago |
MaggotHATE
|
fbc98b748e
sampling : add XTC sampler (#9742)
|
1 year ago |
Diego Devesa
|
7eee341bee
common : use common_ prefix for common library functions (#9805)
|
1 year ago |
Georgi Gerganov
|
b0f27361f3
sampling : avoid expensive softmax during greedy sampling (#9605)
|
1 year ago |
Georgi Gerganov
|
6262d13e0b
common : reimplement logging (#9418)
|
1 year ago |
Georgi Gerganov
|
0abc6a2c25
llama : llama_perf + option to disable timings during decode (#9355)
|
1 year ago |
slaren
|
49006c67b4
llama : move random seed generation to the samplers (#9398)
|
1 year ago |
Xuan Son Nguyen
|
bfe76d4a17
common : move arg parser code to `arg.cpp` (#9388)
|
1 year ago |
Georgi Gerganov
|
f12295b8a9
llama : fix empty ring buffer push (#9358)
|
1 year ago |