Piotr Wilkin
|
16b3f9c300
Valgrind debugging session / multi-chunk support
|
3 months ago |
Piotr Wilkin
|
5417f3294b
Wrong dimension order
|
3 months ago |
Piotr Wilkin
|
e5ffc91d0a
Fix wrong shape for K norm
|
3 months ago |
Piotr Wilkin
|
875de2bcc2
e steps forward, pi steps back
|
3 months ago |
Piotr Wilkin
|
a60458ebee
Remove more debug
|
3 months ago |
Piotr Wilkin
|
78e0fbd8f4
Remove debug
|
3 months ago |
Piotr Wilkin
|
9de7244c26
Fix memory corruption
|
3 months ago |
Piotr Wilkin
|
75586ea36e
Delta.net chunked reimplemented
|
3 months ago |
Piotr Wilkin
|
61fbeef88b
Attempt 246461
|
3 months ago |
Piotr Wilkin
|
a51d4381d4
Like this.
|
3 months ago |
Piotr Wilkin
|
54bb6f1eb9
argh again
|
3 months ago |
Piotr Wilkin
|
20424d8785
argh
|
3 months ago |
Piotr Wilkin
|
413652178f
attempt 2
|
3 months ago |
Piotr Wilkin
|
c5dc442a5d
repeat_interleave
|
3 months ago |
Piotr Wilkin
|
a4fe12821b
Fix layer counting logic
|
3 months ago |
Piotr Wilkin
|
610b0fede7
Wrong tensor for comparison
|
3 months ago |
Piotr Wilkin
|
4d571eda07
Let's dump extra tensors
|
3 months ago |
Piotr Wilkin
|
2cab86a09f
Let the debug out.
|
3 months ago |
Piotr Wilkin
|
7eef0bd948
Rewrite recurrent delta + softmax to separate ops
|
3 months ago |
Piotr Wilkin
|
554593d60d
Variable scopes are fun
|
3 months ago |
Piotr Wilkin
|
0b301889bf
Stabilize tensor dump trigger for now with -n < 50
|
3 months ago |
Piotr Wilkin
|
f0a07c1091
Add proper backend tensor printing, use double for accumulating the sum
|
3 months ago |
Piotr Wilkin
|
4c8771d200
Print 5D tensors
|
3 months ago |
Piotr Wilkin
|
10032affcf
More debug data
|
3 months ago |
Piotr Wilkin
|
d300ce9eba
Hmmmm......
|
3 months ago |
Piotr Wilkin
|
3f5994223b
Hmm...
|
3 months ago |
Piotr Wilkin
|
7348546b5e
Missing cont()
|
3 months ago |
Piotr Wilkin
|
5a161d9461
Remove unnecessary transposes/reshapes
|
3 months ago |
Piotr Wilkin
|
572864287e
Handle case with more than one token per seq with elegant loop plus completely not crazy change to max nodes ;)
|
3 months ago |
Piotr Wilkin
|
c2a82a1773
Move the norm shift to conversion, Gemma 2 style
|
3 months ago |