Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens