cturan/llama.cpp

Author	SHA1 Message	Date
Justine Tunney	65e5f6dadb Fix OpenAI server sampling w.r.t. temp and seed (#4668)	2 years ago
Alexey Parfenov	6123979952 server : allow to specify custom prompt for penalty calculation (#3727)	2 years ago
olexiyb	0ffc92d2d2 server : disable llm logs if SERVER_VERBOSE is off (#3792)	2 years ago
AdithyanI	8edd2b40fd server : fix grammar being ignored (#4494)	2 years ago
Alexey Parfenov	eb16dae7e7 server : fix possible ambiguity in content type charset (#4501)	2 years ago
mzcu	62bd52b7bf server : allow requests larger than 8K (#4500)	2 years ago
ShadovvBeast	88ae8952b6 server : add optional API Key Authentication example (#4441)	2 years ago
shibe2	948ff137ec server : fix handling of characters that span multiple tokens when streaming (#4446)	2 years ago
Vladimir Zorin	d9d4cfef64 server : fix local model name in server (#4420)	2 years ago
Georgi Gerganov	bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309)	2 years ago
Georgi Gerganov	05cd6e5036 server : recognize cache_prompt parameter in OAI API (#4347)	2 years ago
Ed Lee	33e171d1e9 server : fix OpenAI API `stop` field to be optional (#4299)	2 years ago
Georgi Gerganov	d5a1cbde60 llama : support optional tensors (#4283)	2 years ago
Ziad Ben Hadj-Alouane	1d144112c0 server : add --log-disable to disable logging to file (#4260)	2 years ago
Ziad Ben Hadj-Alouane	f43f09366d server : add single-client multi-prompt support (#4232)	2 years ago
Georgi Gerganov	af19d35734 server : OAI API compatibility (#4198)	2 years ago
Haohui Mai	55978ce09b Fix incorrect format strings and uninitialized variables. (#4133)	2 years ago
SoftwareRenderer	936c79b227 server : relay error messages (#4131)	2 years ago
Kerfuffle	91f6499393 Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)	2 years ago
Alexey Parfenov	d96ca7ded7 server : fix crash when prompt exceeds context size (#3996)	2 years ago
Mihai	57ad015dc3 server : add min_p param (#3877)	2 years ago
cebtenzzre	b12fa0d1c1 build : link against build info instead of compiling against it (#3879)	2 years ago
cebtenzzre	898aeca90a llama : implement YaRN RoPE scaling (#2268)	2 years ago
Adrian Hesketh	ca190bca8e server : re-enable completion and embedded at the same time (#3876)	2 years ago
Kerfuffle	6e08281e58 Extend llama_kv_cache_seq_rm to allow matching any sequence (#3843)	2 years ago
Georgi Gerganov	34b2a5e1ee server : do not release slot on image input (#3798)	2 years ago
cebtenzzre	ad93962657 server : add parameter -tb N, --threads-batch N (#3584) (#3768)	2 years ago
Georgi Gerganov	1717521cdb server : do not block system prompt update (#3767)	2 years ago
Marcus Dunn	5be6c803fa llama : remove token functions with `context` args in favor of `model` (#3720)	2 years ago
Georgi Gerganov	438c2ca830 server : parallel decoding and multimodal (#3677)	2 years ago


Commit history
