Commit history

Author        SHA1        Date          Message
Piotr Wilkin  5f5e30007c  3 months ago  Dilution n_seqs -> 1
Piotr Wilkin  eb0a15fc9b  3 months ago  n_tokens -> n_seq_tokens
Piotr Wilkin  ee52fe36f3  3 months ago  Modify sanity check to handle hybrid models
Piotr Wilkin  0dd6110fdc  3 months ago  v1.0
Piotr Wilkin  adcbd9428f  3 months ago  Linear layer output convergence
Piotr Wilkin  666fc0583d  3 months ago  Parity on delta!
Piotr Wilkin  a2c7b6794e  3 months ago  Proper handling for n_tokens > GGML_DELTA_NET_CHUNK
Piotr Wilkin  c1e46f62fa  3 months ago  Achieve pre-chunk-attention parity; remove most of the LLM generated crap
Piotr Wilkin  c87e8d550c  3 months ago  Tensor preparation for delta_net complete
Piotr Wilkin  7ec2df64a4  3 months ago  Added: tri, cumsum. Still a mess.
Piotr Wilkin  6d0ad37cf4  4 months ago  Fix QKV extraction post-convolution
Piotr Wilkin  845a3d7166  4 months ago  Convolution
Piotr Wilkin  638057a29b  4 months ago  Transpose input for convolution
Piotr Wilkin  835d389fc5  4 months ago  Fix BA views as well
Piotr Wilkin  594c1f98ef  4 months ago  QKV splits done right
Piotr Wilkin  dbd4d97cf2  4 months ago  Fix cb calls
Piotr Wilkin  32dcee47ef  4 months ago  Some attempts to get the convolution input right.
Piotr Wilkin  397cd9fd67  4 months ago  Fix whitespace / end-of-line newline issues.
Piotr Wilkin  5a8463f4a6  4 months ago  Add missing LFM2 code
Piotr Wilkin  64de434118  4 months ago  Fixes from main branch
Piotr Wilkin  7bedf4c66c  4 months ago  Refactor llama-model.cpp
Piotr Wilkin  9014feadfa  4 months ago  Change RoPE to NeoX
Piotr Wilkin  f020baa466  4 months ago  Normal attention: apply gate before output
Piotr Wilkin  27fa5f335d  4 months ago  Correct convolution state dimension calculations
Piotr Wilkin  e24c9dfa60  4 months ago  Remove OP_DELTA_NET, fix flake8 and editorchecker because why not
Piotr Wilkin  6e3abeb6c0  4 months ago  Exclude MTP layers in conversion
Piotr Wilkin  43eb7a7757  4 months ago  Now that eval's running move delta net stuff back to llama-model, add cbs
Piotr Wilkin  890fa2c1e3  4 months ago  WE HAVE OUTPUT!
Piotr Wilkin  e590a75905  4 months ago  Cleanup complete, now for the recurrent memory management...
Piotr Wilkin  2b0673c315  4 months ago  Cleanup ggml_delta_net