VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning