R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling
- Source :
- ACL/IJCNLP (1)
- Publication Year :
- 2021
- Publisher :
- arXiv, 2021.
Abstract
- Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. This paper proposes a recursive Transformer model based on differentiable CKY-style binary trees to emulate the composition process. We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. To scale up our approach, we also introduce an efficient pruned tree induction algorithm to enable encoding in just a linear number of composition steps. Experimental results on language modeling and unsupervised parsing show the effectiveness of our approach.
- Comment :
- ACL-IJCNLP 2021
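To make the abstract's core idea concrete, the following is a minimal sketch (not the authors' code) of differentiable CKY-style composition: each span's representation is a softmax-weighted mixture of candidate parents, one per split point, so gradients flow through all possible binary trees at once. The `compose` and `score` networks, the class name, and the dimensions are illustrative assumptions standing in for the paper's recursive Transformer composition cell.

```python
# Sketch of a differentiable CKY chart. Assumptions: a simple linear
# composition cell stands in for the paper's recursive Transformer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiableCKYChart(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Maps concatenated (left, right) child vectors to a parent vector.
        self.compose = nn.Linear(2 * dim, dim)
        # Scores each candidate parent; used to weight split points.
        self.score = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, dim) word embeddings for one sentence.
        n, d = x.shape
        # chart[i][j] holds the representation of span [i, j] (inclusive).
        chart = [[None] * n for _ in range(n)]
        for i in range(n):
            chart[i][i] = x[i]
        for length in range(2, n + 1):           # span length
            for i in range(0, n - length + 1):   # span start
                j = i + length - 1               # span end
                cands, scores = [], []
                for k in range(i, j):            # one candidate per split
                    pair = torch.cat([chart[i][k], chart[k + 1][j]], dim=-1)
                    parent = torch.tanh(self.compose(pair))
                    cands.append(parent)
                    scores.append(self.score(parent))
                cands = torch.stack(cands)               # (splits, dim)
                w = F.softmax(torch.cat(scores), dim=0)  # (splits,)
                # Soft, differentiable choice over split points.
                chart[i][j] = (w.unsqueeze(-1) * cands).sum(0)
        return chart[0][n - 1]  # root representation of the sentence

sent = torch.randn(6, 32)
root = DifferentiableCKYChart(32)(sent)
print(root.shape)  # torch.Size([32])
```

Filling the full chart this way costs a cubic number of compositions in sentence length; per the abstract, the paper's pruned tree induction algorithm avoids this by keeping only promising splits, so encoding needs just a linear number of composition steps.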
- Subjects :
- FOS: Computer and information sciences
  Machine Learning (cs.LG)
  Computation and Language (cs.CL)
  Theoretical computer science
  Computer science
  Parsing
  Binary tree
  Tree (data structure)
  Language model
  Time complexity
  Word (computer architecture)
  Transformer (machine learning model)
  Abstraction (linguistics)
Details
- Database :
- OpenAIRE
- Journal :
- ACL/IJCNLP (1)
- Accession number :
- edsair.doi.dedup.....45f611bd2ea1e04b26ff94601862ea82
- Full Text :
- https://doi.org/10.48550/arxiv.2107.00967