Commit History

Author        SHA1        Message                                                                        Date
  Piotr Wilkin 5306640300 All's well that ends in a well                                                 3 months ago
  Piotr Wilkin 232ec56251 Yes, I finally managed to implement it with ssm_conv :>                        3 months ago
  Piotr Wilkin aa8d6a21a3 Remove extra files cont.                                                       3 months ago
  Piotr Wilkin e9a98f2af9 Remove extra files                                                             3 months ago
  Piotr Wilkin 22ee5a971b Add gate_sigmoid to callback                                                   3 months ago
  Piotr Wilkin ce87b7d78e Yup, it's NeoX                                                                 3 months ago
  Piotr Wilkin df0b5bcf30 Proper order of attention operations                                           3 months ago
  Piotr Wilkin 54712b8664 Oh, forgot to commit                                                           3 months ago
  Piotr Wilkin 17240eafc0 Order stuff around                                                             3 months ago
  Piotr Wilkin 1579bcb202 What am I missing? :/                                                          3 months ago
  Piotr Wilkin 0a9244acd0 The optimization worked even too well ;)                                       4 months ago
  Piotr Wilkin 8ddaf251ae Fix some state regressions... still wip                                        4 months ago
  Piotr Wilkin 6942c85cf8 Oh, actually set n_tasks as well :P                                            4 months ago
  Piotr Wilkin 477c1616ad Parallelize delta_net                                                          4 months ago
  Piotr Wilkin 4ef6f337de Proper multi-sequence convolution calculation, corrected (?) state management  4 months ago
  Piotr Wilkin 5f5e30007c Dilution n_seqs -> 1                                                           4 months ago
  Piotr Wilkin eb0a15fc9b n_tokens -> n_seq_tokens                                                       4 months ago
  Piotr Wilkin ee52fe36f3 Modify sanity check to handle hybrid models                                    4 months ago
  Piotr Wilkin 0dd6110fdc v1.0                                                                           4 months ago
  Piotr Wilkin adcbd9428f Linear layer output convergence                                                4 months ago
  Piotr Wilkin 666fc0583d Parity on delta!                                                               4 months ago
  Piotr Wilkin a2c7b6794e Proper handling for n_tokens > GGML_DELTA_NET_CHUNK                            4 months ago
  Piotr Wilkin c1e46f62fa Achieve pre-chunk-attention parity; remove most of the LLM generated crap      4 months ago
  Piotr Wilkin c87e8d550c Tensor preparation for delta_net complete                                      4 months ago
  Piotr Wilkin 7ec2df64a4 Added: tri, cumsum. Still a mess.                                              4 months ago
  Piotr Wilkin 6d0ad37cf4 Fix QKV extraction post-convolution                                            4 months ago
  Piotr Wilkin 845a3d7166 Convolution                                                                    4 months ago
  Piotr Wilkin 638057a29b Transpose input for convolution                                                4 months ago
  Piotr Wilkin 835d389fc5 Fix BA views as well                                                           4 months ago
  Piotr Wilkin 594c1f98ef QKV splits done right                                                          4 months ago