White-Box Transformers via Sparse Rate Reduction