MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers