Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT