Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems | Read Paper on Bytez