Stronger Convergence Results for Deep Residual Networks: Network Width Scales Linearly with Training Data Size