cturan/llama.cpp

Author	SHA1 Message	Date
Pierrick Hymbert	d01b3c4c32 common: llama_load_model_from_url using --model-url (#6098)	1 year ago
Theia Vogel	877b4d0c62 llama : add support for control vectors (#5970)	1 year ago
Georgi Gerganov	0fd6c1f015 embedding : print cosine similarity (#899)	1 year ago
slaren	f30ea47a87 llama : add pipeline parallelism support (#6017)	1 year ago
SeungWon Jeong	fb215c3832 server : normalize embeddings (#5956)	1 year ago
Minsoo Cheong	6d341ab6c5 speculative : implement stochastic speculative sampling (#5625)	1 year ago
DAN™	82f3e668ad common : use LLAMA_DEFAULT_SEED (#5855)	1 year ago
Douglas Hanley	475df1d6cf llama : allow for user specified embedding pooling type (#5849)	1 year ago
Pierrick Hymbert	3ab8b3a92e llama : cleanup unused mmq flags (#5772)	1 year ago
Georgi Gerganov	9d533a77d0 llama : fix defrag bugs + add parameter (#5735)	1 year ago
Georgi Gerganov	ab336a9d5e code : normalize enum names (#5697)	1 year ago
Alexey Parfenov	6dcc02d244 server : add "samplers" param to control the samplers order (#5494)	1 year ago
bmwl	f486f6e1e5 ggml : add numa options (#5377)	1 year ago
Alexey Parfenov	a803333a4e common : use enums for sampler types (#5418)	1 year ago
Jared Van Bortel	1ec3332ade YaRN : store rope scaling type as int32_t in memory (#5285)	1 year ago
Georgi Gerganov	5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)	2 years ago
Kawrakow	6f9939d119 KL-divergence (#5076)	2 years ago
Kawrakow	7dcbe39d36 Add ability to evauate multiple choice tasks (#5047)	2 years ago
Kawrakow	682986a08e Add Winogrande evaluation (#5015)	2 years ago
stduhpf	e0324285a5 speculative : threading options (#4959)	2 years ago
Yann Follet	722d33f34e main : add parameter --no-display-prompt (#4541)	2 years ago
slaren	e7e4df031b llama : ggml-backend integration (#4766)	2 years ago
Georgi Gerganov	7edefbd79c main : better name for variable n_print (#4874)	2 years ago
Georgi Gerganov	3ca63b4538 main : disable token count by default (#4874)	2 years ago
pudepiedj	43f76bf1c3 main : print total token count and tokens consumed so far (#4874)	2 years ago
Georgi Gerganov	52531fdff8 main : add self-extend support (#4815)	2 years ago
LeonEricsson	7082d24cec lookup : add prompt lookup decoding example (#4484)	2 years ago
Georgi Gerganov	bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309)	2 years ago
Kerfuffle	5aa365d88f llama : allow overriding GGUF metadata when loading model (#4092)	2 years ago
MaggotHATE	52c8bc3cf3 sampling : custom samplers order (#4285)	2 years ago
