Make $\ell_1$ Regularization Effective in Training Sparse CNN