$Q$- and $A$-Learning Methods for Estimating Optimal Dynamic Treatment Regimes