Bjarke Viksøe
|
cb4d86c4d7
server: Retrieve prompt template in /props (#8337)
|
1 year ago |
Pieter Ouwerkerk
|
5a7447c569
readme : fix minor typos [no ci] (#8314)
|
1 year ago |
Sigbjørn Skjæret
|
38373cfbab
Add SPM infill support (#8016)
|
1 year ago |
Olivier Chafik
|
1c641e6aac
`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)
|
1 year ago |
Johannes Gäßler
|
7027b27d76
server: update cache_prompt documentation [no ci] (#7745)
|
1 year ago |
Johannes Gäßler
|
1b01f06db0
server: add test for token probs (#7347)
|
1 year ago |
Johannes Gäßler
|
cb42c29427
server: correct --threads documentation [no ci] (#7362)
|
1 year ago |
Leon Knauer
|
9c4fdcbec8
[Server] Added --verbose option to README [no ci] (#7335)
|
1 year ago |
Ryuei
|
27f65d6267
docs: Fix typo and update description for --embeddings flag (#7026)
|
1 year ago |
Johan
|
911b3900dd
server : add_special option for tokenize endpoint (#7059)
|
1 year ago |
Johannes Gäßler
|
af0a5b6163
server: fix incorrectly reported token probabilities (#7125)
|
1 year ago |
Kyle Mistele
|
260b7c6529
server : update readme with undocumented options (#7013)
|
1 year ago |
Olivier Chafik
|
b8a7a5a90f
build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964)
|
1 year ago |
Olivier Chafik
|
ab9a3240a9
JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length (#6555)
|
1 year ago |
Jan Boon
|
beea6e1b16
llama : save and restore kv cache for single seq id (#6341)
|
1 year ago |
Georgi Gerganov
|
4399f13fb9
server : remove obsolete --memory-f32 option
|
1 year ago |
Fattire
|
5fb1574c81
A few small fixes to server's README docs (#6428)
|
1 year ago |
slaren
|
280345968d
cuda : rename build flag to LLAMA_CUDA (#6299)
|
1 year ago |
Xuan Son Nguyen
|
ad3a0505e3
Server: clean up OAI params parsing function (#6284)
|
1 year ago |
Pierrick Hymbert
|
f482bb2e49
common: llama_load_model_from_url split support (#6192)
|
1 year ago |
Pierrick Hymbert
|
1997577d5e
server: docs: `--threads` and `--threads`, `--ubatch-size`, `--log-disable` (#6254)
|
1 year ago |
Jan Boon
|
be07a03217
server : update readme doc from `slot_id` to `id_slot` (#6213)
|
1 year ago |
Pierrick Hymbert
|
d01b3c4c32
common: llama_load_model_from_url using --model-url (#6098)
|
1 year ago |
Jakub N
|
828defefb6
Update server docker image URLs (#5997)
|
1 year ago |
Xuan Son Nguyen
|
caa106d4e0
Server: format error to json (#5961)
|
1 year ago |
Georgi Gerganov
|
97c09585d6
server : clarify some items in the readme (#5957)
|
1 year ago |
Xuan Son Nguyen
|
950ba1ab84
Server: reorganize some http logic (#5939)
|
1 year ago |
Gabe Goodhart
|
e1fa9569ba
server : add SSL support (#5926)
|
1 year ago |
Georgi Gerganov
|
2002bc96bf
server : refactor (#5882)
|
1 year ago |
Pierrick Hymbert
|
8ef969afce
server : init http requests thread pool with --parallel if set (#5836)
|
1 year ago |