Referring Expression Object Segmentation with Caption-Aware Consistency