cturan
|
bde1886208
fix: incorrect tensor variable used in reshape operations for k and v tensors
|
2 months ago |
cturan
|
2b40434af3
Enhance CUDA kernels for CUMSUM and DELTA_NET operations
|
2 months ago |
cturan
|
2e9f9bc889
Add CUDA kernels for TRI, CUMSUM, and DELTA_NET operations
|
2 months ago |
Piotr Wilkin
|
42d7d1b137
Make Mr. Chunky talk less.
|
2 months ago |
Piotr Wilkin
|
abcacd4cd2
Mr. Chunky likes pizza!
|
2 months ago |
Piotr Wilkin
|
db247cf48f
Mr Chunky is such fun!
|
2 months ago |
Piotr Wilkin
|
5d0a237a1c
More food for Mr. Chunky.
|
2 months ago |
Piotr Wilkin
|
6798b69bcc
Some updates for Mr. Chunky
|
2 months ago |
Piotr Wilkin
|
2fdbf16eb1
Add proper check for previous state
|
2 months ago |
Piotr Wilkin
|
912339a5c2
Proper (?) offsetting
|
2 months ago |
Piotr Wilkin
|
16b3f9c300
Valgrind debugging session / multi-chunk support
|
2 months ago |
Piotr Wilkin
|
5417f3294b
Wrong dimension order
|
3 months ago |
Piotr Wilkin
|
e5ffc91d0a
Fix wrong shape for K norm
|
3 months ago |
Piotr Wilkin
|
875de2bcc2
e steps forward, pi steps back
|
3 months ago |
Piotr Wilkin
|
a60458ebee
Remove more debug
|
3 months ago |
Piotr Wilkin
|
78e0fbd8f4
Remove debug
|
3 months ago |
Piotr Wilkin
|
9de7244c26
Fix memory corruption
|
3 months ago |
Piotr Wilkin
|
75586ea36e
Delta.net chunked reimplemented
|
3 months ago |
Piotr Wilkin
|
61fbeef88b
Attempt 246461
|
3 months ago |
Piotr Wilkin
|
a51d4381d4
Like this.
|
3 months ago |
Piotr Wilkin
|
54bb6f1eb9
argh again
|
3 months ago |
Piotr Wilkin
|
20424d8785
argh
|
3 months ago |
Piotr Wilkin
|
413652178f
attempt 2
|
3 months ago |
Piotr Wilkin
|
c5dc442a5d
repeat_interleave
|
3 months ago |
Piotr Wilkin
|
a4fe12821b
Fix layer counting logic
|
3 months ago |
Piotr Wilkin
|
610b0fede7
Wrong tensor for comparison
|
3 months ago |
Piotr Wilkin
|
4d571eda07
Let's dump extra tensors
|
3 months ago |
Piotr Wilkin
|
2cab86a09f
Let the debug out.
|
3 months ago |
Piotr Wilkin
|
7eef0bd948
Rewrite recurrent delta + softmax to separate ops
|
3 months ago |
Piotr Wilkin
|
554593d60d
Variable scopes are fun
|
3 months ago |