AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix