cturan/llama.cpp

Author	SHA1 Message	Date
SeungWon Jeong	fb215c3832 server : normalize embeddings (#5956)	1 year ago
Minsoo Cheong	6d341ab6c5 speculative : implement stochastic speculative sampling (#5625)	1 year ago
DAN™	82f3e668ad common : use LLAMA_DEFAULT_SEED (#5855)	1 year ago
Douglas Hanley	475df1d6cf llama : allow for user specified embedding pooling type (#5849)	1 year ago
Pierrick Hymbert	3ab8b3a92e llama : cleanup unused mmq flags (#5772)	1 year ago
Georgi Gerganov	9d533a77d0 llama : fix defrag bugs + add parameter (#5735)	1 year ago
Georgi Gerganov	ab336a9d5e code : normalize enum names (#5697)	1 year ago
Alexey Parfenov	6dcc02d244 server : add "samplers" param to control the samplers order (#5494)	1 year ago
bmwl	f486f6e1e5 ggml : add numa options (#5377)	1 year ago
Alexey Parfenov	a803333a4e common : use enums for sampler types (#5418)	1 year ago
Jared Van Bortel	1ec3332ade YaRN : store rope scaling type as int32_t in memory (#5285)	1 year ago
Georgi Gerganov	5cb04dbc16 llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240)	2 years ago
Kawrakow	6f9939d119 KL-divergence (#5076)	2 years ago
Kawrakow	7dcbe39d36 Add ability to evauate multiple choice tasks (#5047)	2 years ago
Kawrakow	682986a08e Add Winogrande evaluation (#5015)	2 years ago
stduhpf	e0324285a5 speculative : threading options (#4959)	2 years ago
Yann Follet	722d33f34e main : add parameter --no-display-prompt (#4541)	2 years ago
slaren	e7e4df031b llama : ggml-backend integration (#4766)	2 years ago
Georgi Gerganov	7edefbd79c main : better name for variable n_print (#4874)	2 years ago
Georgi Gerganov	3ca63b4538 main : disable token count by default (#4874)	2 years ago
pudepiedj	43f76bf1c3 main : print total token count and tokens consumed so far (#4874)	2 years ago
Georgi Gerganov	52531fdff8 main : add self-extend support (#4815)	2 years ago
LeonEricsson	7082d24cec lookup : add prompt lookup decoding example (#4484)	2 years ago
Georgi Gerganov	bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309)	2 years ago
Kerfuffle	5aa365d88f llama : allow overriding GGUF metadata when loading model (#4092)	2 years ago
MaggotHATE	52c8bc3cf3 sampling : custom samplers order (#4285)	2 years ago
Georgi Gerganov	6b0a7420d0 llama : KV cache view API + better KV cache management (#4170)	2 years ago
Seb C	881800d1f0 main : Add ChatML functionality to main example (#4046)	2 years ago
Kerfuffle	91f6499393 Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040)	2 years ago
Georgi Gerganov	8f961abdc4 speculative : change default p_accept to 0.5 + CLI args (#3919)	2 years ago

Commit History