cturan/llama.cpp

Author	SHA1 Message	Date
Kamil Tomšík	b906596bb7 Add Ava in the list of llama.cpp UIs (#4362)	1 year ago
Johannes Gäßler	aa7ab99be2 CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row (#5386)	1 year ago
Neo Zhang Jianyu	10afa6f1d1 [SYCL] update install make by w64devkit (#5297)	1 year ago
Xiao-Yong Jin	0ef46da632 llava-cli : always tokenize special tokens (#5382)	1 year ago
0cc4m	ee1628bdfe Basic Vulkan Multi-GPU implementation (#5321)	1 year ago
Eve	ed0bf32290 readme : modernize (#5379)	1 year ago
Ben Williams	9a697d842b readme : update ui list (#5354)	1 year ago
runfuture	316c7faf77 llama : add MiniCPM support (#5346)	1 year ago
Justin Parker	f3e2b4fa3f server : update `/props` with "total_slots" value (#5373)	1 year ago
Sang-Kil Park	f68664ac24 convert : fix TypeError on GPT-2 vocab.json (#5288)	1 year ago
Alexey Parfenov	213d1439fa server : remove model.json endpoint (#5371)	1 year ago
Johannes Gäßler	17c97fb062 CUDA: mul_mat_vec_q max. batch size 8 -> 4 (#5370)	1 year ago
Kawrakow	b08f22c882 Update README.md (#5366)	1 year ago
Kawrakow	f57fadc009 Slight quantization improvement for Q4_K and Q5_K (#5361)	1 year ago
BarfingLemurs	2e9c0bd6b3 readme : add phi, orion 14b, internlm2, and yi-VL to readme (#5362)	1 year ago
Johannes Gäßler	2c516611f1 CUDA: mul_mat_vec_q for batch sizes > 1 (#5351)	1 year ago
Justin Parker	8a79c591de server : include total "num_slots" in props endpoint (#5349)	1 year ago
Michael Coppola	31e7903221 server : add `dynatemp_range` and `dynatemp_exponent` (#5352)	1 year ago
Niall Coates	4ffc7a17d4 server : various fixes for the prompt field in /completion (#5300)	1 year ago
Georgi Gerganov	906cff55c2 py : handle byte tokens in `get_token_type` (#5341)	1 year ago
Johannes Gäßler	098f6d737b make: Use ccache for faster compilation (#5318)	1 year ago
Johannes Gäßler	78b00dda6c README: updated introduction (#5343)	1 year ago
Kawrakow	c6b395535a ggml : make use of ggml-quants.h possible in C++ code (#5338)	1 year ago
Dr. Tom Murphy VII Ph.D	abb61944a5 ggml : avoid duplicating function calls using MIN/MAX macros (#5325)	1 year ago
Kawrakow	89503dcb5f iq3_xxs: quards for the no-imatrix situation (#5334)	1 year ago
Guoteng	7e1ae372f3 py : fix internlm2-hf convert to gguf (#5305)	1 year ago
Kawrakow	6fdfa2ecc6 iq2_xxs: tune quantization (#5320)	1 year ago
Alexey Parfenov	a2d60c9158 server : allow to get default generation settings for completion (#5307)	1 year ago
l3utterfly	e6f8177532 common : add dynamic temperature parameters to main example cli (#5295)	1 year ago
Georgi Gerganov	30679d438d scripts : fix typos, cleanup (#5303)	1 year ago

