DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-Training via Word-Region Alignment