WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training