Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients
