Video Object Segmentation with Language Referring Expressions
2018·arXiv