Georgi Gerganov
|
afa8a9ec9b
llama : add `llama_vocab`, functions -> methods, naming (#11110)
|
1 year ago |
Georgi Gerganov
|
e6e7c75d94
server : fix extra BOS in infill endpoint (#11106)
|
1 year ago |
Georgi Gerganov
|
f66f582927
llama : refactor `src/llama.cpp` (#10902)
|
1 year ago |
Xuan Son Nguyen
|
0da5d86026
server : allow using LoRA adapters per-request (#10994)
|
1 year ago |
Xuan Son Nguyen
|
45095a61bf
server : clean up built-in template detection (#11026)
|
1 year ago |
Xuan Son Nguyen
|
5896c65232
server : add OAI compat for /v1/completions (#10974)
|
1 year ago |
Alexey Parfenov
|
16cdce7b68
server : fix token duplication when streaming with stop strings (#10997)
|
1 year ago |
Reza Kakhki
|
9ba399dfa7
server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)
|
1 year ago |
NeverLucky
|
09fe2e7613
server: allow filtering llama server response fields (#10940)
|
1 year ago |
Xuan Son Nguyen
|
14b699ecde
server : fix missing model id in /model endpoint (#10957)
|
1 year ago |
Xuan Son Nguyen
|
485dc01214
server : add system_fingerprint to chat/completion (#10917)
|
1 year ago |
Xuan Son Nguyen
|
57bb2c40cd
server : fix logprobs, make it OAI-compatible (#10783)
|
1 year ago |
Georgi Gerganov
|
152610eda9
server : output embeddings for all tokens when pooling = none (#10861)
|
1 year ago |
Georgi Gerganov
|
0e70ba686e
server : add "tokens" output (#10853)
|
1 year ago |
Xuan Son Nguyen
|
46828872c3
server : (embeddings) using same format for "input" and "content" (#10872)
|
1 year ago |
krystiancha
|
05c3a444b8
server : fill usage info in embeddings and rerank responses (#10852)
|
1 year ago |
Georgi Gerganov
|
644fd71b44
sampling : refactor + optimize penalties sampler (#10803)
|
1 year ago |
Vinesh Janarthanan
|
5478bbcd17
server: (UI) add syntax highlighting and latex math rendering (#10808)
|
1 year ago |
Michelle Tan
|
89d604f2c8
server: Fix `has_next_line` in JSON response (#10818)
|
1 year ago |
cduk
|
56eea0781c
Removes spurious \r in output that causes logging in journalctl to treat lines as binary and therefore hidden by default (#10771)
|
1 year ago |
Yüg
|
a86ad841f1
server : add flag to disable the web-ui (#10762) (#10751)
|
1 year ago |
Xuan Son Nguyen
|
ce8784bdb1
server : fix format_infill (#10724)
|
1 year ago |
Xuan Son Nguyen
|
e52522b869
server : bring back info of final chunk in stream mode (#10722)
|
1 year ago |
Xuan Son Nguyen
|
3573fa8e7b
server : (refactor) no more json in server_task input (#10691)
|
1 year ago |
Georgi Gerganov
|
ce4a7b8493
server : various fixes (#10704)
|
1 year ago |
Georgi Gerganov
|
c2a16c0bdb
server : fix free of spec context and batch (#10651)
|
1 year ago |
Xuan Son Nguyen
|
6c5bc0625f
server : (refactoring) do not rely on JSON internally (#10643)
|
1 year ago |
Georgi Gerganov
|
1da7b76569
server : fix speculative decoding with context shift (#10641)
|
1 year ago |
Xuan Son Nguyen
|
91c36c269b
server : (web ui) Various improvements, now use vite as bundler (#10599)
|
1 year ago |
Georgi Gerganov
|
70b98fadbc
server : fix default draft model parameters (#10586)
|
1 year ago |