From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
