FG-CLIP: Fine-Grained Visual and Textual Alignment