Do Neural Networks Need Gradient Descent to Generalize? A Theoretical Study