Implicit Regularization of Stochastic Gradient Descent in Natural Language Processing: Observations and Implications