SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion