Johannes Gäßler
|
e789095502
llama: print memory breakdown on exit (#15860)
|
3 months ago |
Georgi Gerganov
|
e92d53b29e
sampling : optimize samplers by reusing bucket sort (#15665)
|
4 months ago |
Olivier Chafik
|
f5cd27b71d
`server`: streaming of tool calls and thoughts when `--jinja` is on (#12379)
|
7 months ago |
Ycros
|
39e73ae0d6
common : Add a warning when we can't match samplers from a string or char. (#13330)
|
8 months ago |
oobabooga
|
233461f812
sampling : Integrate Top-nσ into main sampling chain (and add it to the server) (#13264)
|
8 months ago |
Johannes Gäßler
|
dd373dd3bf
llama: fix error on bad grammar (#12628)
|
9 months ago |
Olivier Chafik
|
669912d9a5
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)
|
10 months ago |
mgroeber9110
|
5bbe6a9fe9
ggml : portability fixes for VS 2017 (#12150)
|
10 months ago |
Olivier Chafik
|
c7f460ab88
`server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content (Command 7RB & DeepSeek R1) unless `--reasoning-format none` (#11607)
|
11 months ago |
Vinesh Janarthanan
|
27e8a23300
sampling: add Top-nσ sampler (#11223)
|
11 months ago |
Michał Moskal
|
ff227703d6
sampling : support for llguidance grammars (#10224)
|
11 months ago |
Olivier Chafik
|
8b576b6c55
Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
|
11 months ago |
Georgi Gerganov
|
afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming (#11110)
|
1 year ago |
Georgi Gerganov
|
644fd71b44
sampling : refactor + optimize penalties sampler (#10803)
|
1 year ago |
Georgi Gerganov
|
d9d54e498d
speculative : refactor and add a simpler example (#10362)
|
1 year ago |
Georgi Gerganov
|
8d8ff71536
llama : remove Tail-Free sampling (#10071)
|
1 year ago |
wwoodsTM
|
ff252ea48e
llama : add DRY sampler (#9702)
|
1 year ago |
Georgi Gerganov
|
55e47786e3
llama : default sampling changes + greedy update (#9897)
|
1 year ago |
Georgi Gerganov
|
755a9b2bf0
llama : add infill sampler (#9896)
|
1 year ago |
MaggotHATE
|
fbc98b748e
sampling : add XTC sampler (#9742)
|
1 year ago |
Diego Devesa
|
7eee341bee
common : use common_ prefix for common library functions (#9805)
|
1 year ago |
Georgi Gerganov
|
b0f27361f3
sampling : avoid expensive softmax during greedy sampling (#9605)
|
1 year ago |
Georgi Gerganov
|
6262d13e0b
common : reimplement logging (#9418)
|
1 year ago |
Georgi Gerganov
|
0abc6a2c25
llama : llama_perf + option to disable timings during decode (#9355)
|
1 year ago |
slaren
|
49006c67b4
llama : move random seed generation to the samplers (#9398)
|
1 year ago |
Xuan Son Nguyen
|
bfe76d4a17
common : move arg parser code to `arg.cpp` (#9388)
|
1 year ago |
Georgi Gerganov
|
f12295b8a9
llama : fix empty ring buffer push (#9358)
|
1 year ago |
Georgi Gerganov
|
df270ef745
llama : refactor sampling v2 (#9294)
|
1 year ago |
Georgi Gerganov
|
938943cdbf
llama : move vocab, grammar and sampling into separate files (#8508)
|
1 year ago |
Kevin Wang
|
470939d483
common : preallocate sampling token data vector (#8363)
|
1 year ago |