VisQA: X-raying Vision and Language Reasoning in Transformers