UniT: Multimodal Multitask Learning with a Unified Transformer