Georgi Gerganov | b0f27361f3 | sampling : avoid expensive softmax during greedy sampling (#9605) | 1 year ago
Michael Podvitskiy | 37f3a3810e | llama : add llama_n_head() (#9512) | 1 year ago
Georgi Gerganov | 0abc6a2c25 | llama : llama_perf + option to disable timings during decode (#9355) | 1 year ago
Gilad S. | bd35cb0ae3 | feat: remove a sampler from a chain (#9445) | 1 year ago
slaren | 49006c67b4 | llama : move random seed generation to the samplers (#9398) | 1 year ago
slaren | 5fb5e24811 | llama : minor sampling refactor (2) (#9386) | 1 year ago
Georgi Gerganov | df270ef745 | llama : refactor sampling v2 (#9294) | 1 year ago
compilade | 9bc6db28d0 | ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151) | 1 year ago
Molly Sophia | 8f1d81a0b6 | llama : support RWKV v6 models (#8980) | 1 year ago
Sutou Kouhei | 0ab30f8d82 | llama : fix llama_split_mode enum values in main_gpu document (#9057) | 1 year ago
Faisal Zaghloul | 42c76d1358 | Threadpool: take 2 (#8672) | 1 year ago
compilade | a1631e53f6 | llama : simplify Mamba with advanced batch splits (#8526) | 1 year ago
Minsoo Cheong | c679e0cb5c | llama : add EXAONE model support (#9025) | 1 year ago
Zhenwei Jin | 4af8420afb | common : remove duplicate function llama_should_add_bos_token (#8778) | 1 year ago
Esko Toivonen | 6bda7ce6c3 | llama : add pre-tokenizer regexes for BLOOM and gpt3-finnish (#8850) | 1 year ago
Daniel Bevenius | 06943a69f6 | ggml : move rope type enum to ggml.h (#8949) | 1 year ago
fairydreaming | 7c3f55c100 | Add support for encoder-only T5 models (#8900) | 1 year ago
Nexes the Old | 31958546c3 | typo correction (#8891) | 1 year ago
compilade | 4c676c85e5 | llama : refactor session file management (#8699) | 1 year ago
Xuan Son Nguyen | b115105f05 | add llama_lora_adapter_clear (#8653) | 1 year ago
Georgi Gerganov | 938943cdbf | llama : move vocab, grammar and sampling into separate files (#8508) | 1 year ago
Keke Han | 081fe431aa | llama : fix codeshell support (#8599) | 1 year ago
Jason Stillerman | d94c6e0ccb | llama : add support for SmolLm pre-tokenizer (#8609) | 1 year ago
Michael Coppola | 940362224d | llama : add support for Tekken pre-tokenizer (#8579) | 1 year ago
Georgi Gerganov | d197545530 | llama : bump max layers from 256 to 512 (#8530) | 1 year ago
Georgi Gerganov | 0efec57787 | llama : valign + remove unused ftype (#8502) | 1 year ago
Xuan Son Nguyen | 97bdd26eee | Refactor lora adapter support (#8332) | 1 year ago
Dibakar Gope | 0f1a39f343 | ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (#5780) | 1 year ago
toyer | 905942abdb | llama : support glm3 and glm4 (#8031) | 1 year ago
jaime-m-p | 213701b51a | Detokenizer fixes (#8039) | 1 year ago