Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering