Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding