Improved training of end-to-end attention models for speech recognition