Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
