Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks
2020 · arXiv