Piotr Wilkin
|
4ef6f337de
Proper multi-sequence convolution calculation, corrected (?) state management
|
3 months ago |
Piotr Wilkin
|
5f5e30007c
Dilution n_seqs -> 1
|
3 months ago |
Piotr Wilkin
|
eb0a15fc9b
n_tokens -> n_seq_tokens
|
3 months ago |
Piotr Wilkin
|
ee52fe36f3
Modify sanity check to handle hybrid models
|
3 months ago |
Piotr Wilkin
|
0dd6110fdc
v1.0
|
3 months ago |
Piotr Wilkin
|
adcbd9428f
Linear layer output convergence
|
3 months ago |
Piotr Wilkin
|
666fc0583d
Parity on delta!
|
3 months ago |
Piotr Wilkin
|
a2c7b6794e
Proper handling for n_tokens > GGML_DELTA_NET_CHUNK
|
3 months ago |
Piotr Wilkin
|
c1e46f62fa
Achieve pre-chunk-attention parity; remove most of the LLM generated crap
|
3 months ago |
Piotr Wilkin
|
c87e8d550c
Tensor preparation for delta_net complete
|
3 months ago |
Piotr Wilkin
|
7ec2df64a4
Added: tri, cumsum. Still a mess.
|
3 months ago |
Piotr Wilkin
|
6d0ad37cf4
Fix QKV extraction post-convolution
|
3 months ago |
Piotr Wilkin
|
845a3d7166
Convolution
|
3 months ago |
Piotr Wilkin
|
638057a29b
Transpose input for convolution
|
3 months ago |
Piotr Wilkin
|
835d389fc5
Fix BA views as well
|
3 months ago |
Piotr Wilkin
|
594c1f98ef
QKV splits done right
|
3 months ago |
Piotr Wilkin
|
dbd4d97cf2
Fix cb calls
|
3 months ago |
Piotr Wilkin
|
32dcee47ef
Some attempts to get the convolution input right.
|
3 months ago |
Piotr Wilkin
|
397cd9fd67
Fix whitespace / end-of-line newline issues.
|
3 months ago |
Piotr Wilkin
|
5a8463f4a6
Add missing LFM2 code
|
3 months ago |
Piotr Wilkin
|
64de434118
Fixes from main branch
|
3 months ago |
Piotr Wilkin
|
7bedf4c66c
Refactor llama-model.cpp
|
3 months ago |
Piotr Wilkin
|
9014feadfa
Change RoPE to NeoX
|
3 months ago |
Piotr Wilkin
|
f020baa466
Normal attention: apply gate before output
|
4 months ago |
Piotr Wilkin
|
27fa5f335d
Correct convolution state dimension calculations
|
4 months ago |
Piotr Wilkin
|
e24c9dfa60
Remove OP_DELTA_NET, fix flake8 and editorchecker because why not
|
4 months ago |
Piotr Wilkin
|
6e3abeb6c0
Exclude MTP layers in conversion
|
4 months ago |
Piotr Wilkin
|
43eb7a7757
Now that eval's running move delta net stuff back to llama-model, add cbs
|
4 months ago |
Piotr Wilkin
|
890fa2c1e3
WE HAVE OUTPUT!
|
4 months ago |
Piotr Wilkin
|
e590a75905
Cleanup complete, now for the recurrent memory management...
|
4 months ago |