Cross-modal Causal Relation Alignment for Video Question Grounding