SGD vs GD: Rank Deficiency in Linear Networks