Fine-Grained Captioning of Long Videos through Scene Graph Consolidation