Disentangling trainability and generalization in deep learning