0dd6110fdc  v1.0  (Piotr Wilkin, 3 months ago)
adcbd9428f  Linear layer output convergence  (Piotr Wilkin, 3 months ago)
666fc0583d  Parity on delta!  (Piotr Wilkin, 3 months ago)
a2c7b6794e  Proper handling for n_tokens > GGML_DELTA_NET_CHUNK  (Piotr Wilkin, 3 months ago)
c1e46f62fa  Achieve pre-chunk-attention parity; remove most of the LLM generated crap  (Piotr Wilkin, 3 months ago)
c87e8d550c  Tensor preparation for delta_net complete  (Piotr Wilkin, 3 months ago)
7ec2df64a4  Added: tri, cumsum. Still a mess.  (Piotr Wilkin, 3 months ago)
6d0ad37cf4  Fix QKV extraction post-convolution  (Piotr Wilkin, 3 months ago)
845a3d7166  Convolution  (Piotr Wilkin, 4 months ago)
638057a29b  Transpose input for convolution  (Piotr Wilkin, 4 months ago)
835d389fc5  Fix BA views as well  (Piotr Wilkin, 4 months ago)
594c1f98ef  QKV splits done right  (Piotr Wilkin, 4 months ago)
dbd4d97cf2  Fix cb calls  (Piotr Wilkin, 4 months ago)
32dcee47ef  Some attempts to get the convolution input right.  (Piotr Wilkin, 4 months ago)
7bedf4c66c  Refactor llama-model.cpp  (Piotr Wilkin, 4 months ago)