Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation