Regret Bounds for Reinforcement Learning via Markov Chain Concentration