Diffusion Beats Autoregressive in Data-Constrained Settings