The Implicit Bias of Gradient Descent on Separable Data
2017 · arXiv