CLIP is Almost All You Need: Towards Parameter-Efficient Scene Text Retrieval without OCR