cturan/llama.cpp

Author	SHA1 Message	Date
CJ Pais	6560bed3f0 server : support llava 1.6 (#5553)	1 year ago
Xuan Son Nguyen	9c405c9f9a Server: use llama_chat_apply_template (#5593)	1 year ago
Pierrick Hymbert	c0a8c6db37 server : health endpoint configurable failure on no slot (#5594)	1 year ago
Robey Holderith	5ee99c32f5 common, server : surface min_keep as its own parameter (#5567)	1 year ago
Pierrick Hymbert	c145f8a132 server : slots monitoring endpoint (#5550)	1 year ago
Pierrick Hymbert	e75c6279d1 server : enhanced health endpoint (#5548)	1 year ago
Pierrick Hymbert	36376abe05 server : --n-predict option document and cap to max value (#5549)	1 year ago
Daniel Hiltgen	66c1968f7a server : graceful server shutdown (#5244)	1 year ago
Alexey Parfenov	6dcc02d244 server : add "samplers" param to control the samplers order (#5494)	1 year ago
Rőczey Barnabás	5f5808ca7b server : fix system prompt cli (#5516)	1 year ago
bmwl	f486f6e1e5 ggml : add numa options (#5377)	1 year ago
Elbios	0d4177126b llava : fix memory management bug (#5491)	1 year ago
John	aa23412989 llava : support v1.6 (#5267)	1 year ago
Alexey Parfenov	684780141a server : allow to specify tokens as strings in logit_bias (#5003)	1 year ago
Xuan Son Nguyen	907e08c110 server : add llama2 chat template (#5425)	1 year ago
Riley Stewart	7c777fcd5d server : fix prompt caching for repeated prompts (#5420)	1 year ago
Justin Parker	f3e2b4fa3f server : update `/props` with "total_slots" value (#5373)	1 year ago
Alexey Parfenov	213d1439fa server : remove model.json endpoint (#5371)	1 year ago
Justin Parker	8a79c591de server : include total "num_slots" in props endpoint (#5349)	1 year ago
Michael Coppola	31e7903221 server : add `dynatemp_range` and `dynatemp_exponent` (#5352)	1 year ago
Niall Coates	4ffc7a17d4 server : various fixes for the prompt field in /completion (#5300)	1 year ago
Alexey Parfenov	a2d60c9158 server : allow to get default generation settings for completion (#5307)	1 year ago
Michael Klimenko	52bb63c708 refactor : switch to emplace_back to avoid extra object (#5291)	1 year ago
Georgi Gerganov	5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)	1 year ago
Georgi Gerganov	e6f291d158 server : fix context shift (#5195)	1 year ago
Wu Jian Ping	c82d18e863 server : embeddings compatibility for OpenAI (#5190)	2 years ago
Abhilash Majumder	0f648573dd ggml : add unified SYCL backend for Intel GPUs (#2690)	2 years ago
Michael Klimenko	35a2ee9143 Remove unused data and add fixes (#5154)	2 years ago
Maximilian Winter	ec903c0341 server : add self-extend support (#5104)	2 years ago
Xuan Son Nguyen	48c857aa10 server : refactored the task processing logic (#5065)	2 years ago

Commit History