Towards Self-Refinement of Vision-Language Models with Triangular Consistency