DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging