Towards In-context Scene Understanding