Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set