SeungWon Jeong | fb215c3832 | server : normalize embeddings (#5956) | 1 year ago
Minsoo Cheong | 6d341ab6c5 | speculative : implement stochastic speculative sampling (#5625) | 1 year ago
DAN™ | 82f3e668ad | common : use LLAMA_DEFAULT_SEED (#5855) | 1 year ago
Douglas Hanley | 475df1d6cf | llama : allow for user specified embedding pooling type (#5849) | 1 year ago
Pierrick Hymbert | 3ab8b3a92e | llama : cleanup unused mmq flags (#5772) | 1 year ago
Georgi Gerganov | 9d533a77d0 | llama : fix defrag bugs + add parameter (#5735) | 1 year ago
Georgi Gerganov | ab336a9d5e | code : normalize enum names (#5697) | 1 year ago
Alexey Parfenov | 6dcc02d244 | server : add "samplers" param to control the samplers order (#5494) | 1 year ago
bmwl | f486f6e1e5 | ggml : add numa options (#5377) | 1 year ago
Alexey Parfenov | a803333a4e | common : use enums for sampler types (#5418) | 1 year ago
Jared Van Bortel | 1ec3332ade | YaRN : store rope scaling type as int32_t in memory (#5285) | 1 year ago
Georgi Gerganov | 5cb04dbc16 | llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) | 2 years ago
Kawrakow | 6f9939d119 | KL-divergence (#5076) | 2 years ago
Kawrakow | 7dcbe39d36 | Add ability to evauate multiple choice tasks (#5047) | 2 years ago
Kawrakow | 682986a08e | Add Winogrande evaluation (#5015) | 2 years ago
stduhpf | e0324285a5 | speculative : threading options (#4959) | 2 years ago
Yann Follet | 722d33f34e | main : add parameter --no-display-prompt (#4541) | 2 years ago
slaren | e7e4df031b | llama : ggml-backend integration (#4766) | 2 years ago
Georgi Gerganov | 7edefbd79c | main : better name for variable n_print (#4874) | 2 years ago
Georgi Gerganov | 3ca63b4538 | main : disable token count by default (#4874) | 2 years ago
pudepiedj | 43f76bf1c3 | main : print total token count and tokens consumed so far (#4874) | 2 years ago
Georgi Gerganov | 52531fdff8 | main : add self-extend support (#4815) | 2 years ago
LeonEricsson | 7082d24cec | lookup : add prompt lookup decoding example (#4484) | 2 years ago
Georgi Gerganov | bcc0eb4591 | llama : per-layer KV cache + quantum K cache (#4309) | 2 years ago
Kerfuffle | 5aa365d88f | llama : allow overriding GGUF metadata when loading model (#4092) | 2 years ago
MaggotHATE | 52c8bc3cf3 | sampling : custom samplers order (#4285) | 2 years ago
Georgi Gerganov | 6b0a7420d0 | llama : KV cache view API + better KV cache management (#4170) | 2 years ago
Seb C | 881800d1f0 | main : Add ChatML functionality to main example (#4046) | 2 years ago
Kerfuffle | 91f6499393 | Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) | 2 years ago
Georgi Gerganov | 8f961abdc4 | speculative : change default p_accept to 0.5 + CLI args (#3919) | 2 years ago