Ovis: Structural Embedding Alignment for Multimodal Large Language Model