Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients