Olivier Chafik | f5cd27b71d | `server`: streaming of tool calls and thoughts when `--jinja` is on (#12379) | 7 months ago
Olivier Chafik | 669912d9a5 | `tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034) | 10 months ago
Daniel Bevenius | 9626d9351a | llama : fix indentation in llama-grammar [no ci] (#11943) | 11 months ago
Olivier Chafik | c7f460ab88 | `server`: fix tool-call of DeepSeek R1 Qwen, return reasoning_content (Command 7RB & DeepSeek R1) unless `--reasoning-format none` (#11607) | 11 months ago
Olivier Chafik | 90f9b88afb | nit: more informative crash when grammar sampler fails (#11593) | 11 months ago
Olivier Chafik | 8b576b6c55 | Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639) | 11 months ago
Georgi Gerganov | afa8a9ec9b | llama : add `llama_vocab`, functions -> methods, naming (#11110) | 1 year ago
Georgi Gerganov | f66f582927 | llama : refactor `src/llama.cpp` (#10902) | 1 year ago
Georgi Gerganov | 5cab3e4aaa | llama : minor grammar refactor (#10897) | 1 year ago
Georgi Gerganov | df270ef745 | llama : refactor sampling v2 (#9294) | 1 year ago
slaren | 2b1f616b20 | ggml : reduce hash table reset cost (#8698) | 1 year ago
Georgi Gerganov | 938943cdbf | llama : move vocab, grammar and sampling into separate files (#8508) | 1 year ago