Commit History

Author SHA1 Message Date
  Piotr Wilkin 0b301889bf Stabilize tensor dump trigger for now with -n < 50 3 months ago
  Piotr Wilkin f0a07c1091 Add proper backend tensor printing, use double for accumulating the sum 3 months ago
  Piotr Wilkin 4c8771d200 Print 5D tensors 3 months ago
  Piotr Wilkin 10032affcf More debug data 3 months ago
  Piotr Wilkin d300ce9eba Hmmmm...... 3 months ago
  Piotr Wilkin 3f5994223b Hmm... 3 months ago
  Piotr Wilkin 7348546b5e Missing cont() 3 months ago
  Piotr Wilkin 5a161d9461 Remove unnecessary transposes/reshapes 3 months ago
  Piotr Wilkin 572864287e Handle case with more than one token per seq with elegant loop plus completely not crazy change to max nodes ;) 3 months ago
  Piotr Wilkin c2a82a1773 Move the norm shift to conversion, Gemma 2 style 3 months ago
  Piotr Wilkin 5306640300 All's well that ends in a well 3 months ago
  Piotr Wilkin 232ec56251 Yes, I finally managed to implement it with ssm_conv :> 3 months ago
  Piotr Wilkin aa8d6a21a3 Remove extra files cont. 3 months ago
  Piotr Wilkin e9a98f2af9 Remove extra files 3 months ago
  Piotr Wilkin 22ee5a971b Add gate_sigmoid to callback 3 months ago
  Piotr Wilkin ce87b7d78e Yup, it's NeoX 3 months ago
  Piotr Wilkin df0b5bcf30 Proper order of attention operations 3 months ago
  Piotr Wilkin 54712b8664 Oh, forgot to commit 3 months ago
  Piotr Wilkin 17240eafc0 Order stuff around 3 months ago
  Piotr Wilkin 1579bcb202 What am I missing? :/ 3 months ago
  Piotr Wilkin 0a9244acd0 The optimization worked even too well ;) 3 months ago
  Piotr Wilkin 8ddaf251ae Fix some state regressions... still wip 3 months ago
  Piotr Wilkin 6942c85cf8 Oh, actually set n_tasks as well :P 3 months ago
  Piotr Wilkin 477c1616ad Parallelize delta_net 3 months ago
  Piotr Wilkin 4ef6f337de Proper multi-sequence convolution calculation, corrected (?) state management 3 months ago
  Piotr Wilkin 5f5e30007c Dilution n_seqs -> 1 3 months ago
  Piotr Wilkin eb0a15fc9b n_tokens -> n_seq_tokens 3 months ago
  Piotr Wilkin ee52fe36f3 Modify sanity check to handle hybrid models 3 months ago
  Piotr Wilkin 0dd6110fdc v1.0 3 months ago
  Piotr Wilkin adcbd9428f Linear layer output convergence 3 months ago