cturan/llama.cpp

Author	SHA1 Message	Date
Daniel Bevenius	433def286e llama : rename ctx to user_data in progress_callback (#7045)	1 year ago
Georgi Gerganov	9c67c2773d ggml : add Flash Attention (#5021)	1 year ago
Georgi Gerganov	f4ab2a4147 llama : fix BPE pre-tokenization (#6920)	1 year ago
Pierrick Hymbert	0c4d489e29 quantize: add imatrix and dataset metadata in GGUF (#6658)	1 year ago
slaren	017e6999b5 add basic tensor data validation function (#6884)	1 year ago
jiez	1966eb2615 quantize : add '--keep-split' to quantize model into shards (#6688)	1 year ago
Douglas Hanley	b4e4b8a935 llama : add llama_get_pooling_type function (#6862)	1 year ago
Johannes Gäßler	28103f4832 Server: fix seed for multiple slots (#6835)	1 year ago
Georgi Gerganov	40f74e4d73 llama : add option to render special/control tokens (#6807)	1 year ago
Pedro Cuenca	b97bc3966e llama : support Llama 3 HF conversion (#6745)	1 year ago
Olivier Chafik	cbaadc9294 grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)	1 year ago
Jared Van Bortel	1b67731e18 BERT tokenizer fixes (#6498)	1 year ago
Rick G	e3c337d87c llama : support negative ith in llama_get_ API (#6519)	1 year ago
Jan Boon	beea6e1b16 llama : save and restore kv cache for single seq id (#6341)	1 year ago
Clint Herron	9b84ae1806 examples : add GBNF validator program (#5948)	1 year ago
Jared Van Bortel	be55134a53 convert : refactor vocab selection logic (#6355)	1 year ago
compilade	557410b8f0 llama : greatly reduce output buffer memory usage (#6122)	1 year ago
Kawrakow	55c1b2a3bb IQ1_M: 1.75 bpw quantization (#6302)	1 year ago
Kawrakow	d25b1c31b0 quantize : be able to override metadata by key (#6321)	1 year ago
Kawrakow	1d0331c12a quantize: options for output and token embedding tensors qtype (#6239)	1 year ago
Pierrick Hymbert	dba1af6129 llama_model_loader: support multiple split/shard GGUFs (#6187)	1 year ago
Theia Vogel	877b4d0c62 llama : add support for control vectors (#5970)	1 year ago
Michael Podvitskiy	69ff61397d llama : support models without vocabulary (#5798)	1 year ago
slaren	f30ea47a87 llama : add pipeline parallelism support (#6017)	1 year ago
Georgi Gerganov	05b06210c9 llama : more consistent names of count variables (#5994)	1 year ago
Georgi Gerganov	ee35600b90 llama : fix F16/F32 downcast + improve names (#5980)	1 year ago
DAN™	bcebd7dbf6 llama : add support for GritLM (#5959)	1 year ago
compilade	c2101a2e90 llama : support Mamba Selective State Space Models (#5328)	1 year ago
Georgi Gerganov	29ae62d2ae llama : fix embeddings (#5796)	1 year ago
Douglas Hanley	475df1d6cf llama : allow for user specified embedding pooling type (#5849)	1 year ago

