
Content Modeling Using Latent Permutations

Authors :
Chen, Harr
Branavan, S. R. K.
Barzilay, Regina
Karger, David R.
Source :
Journal of Artificial Intelligence Research, Volume 36, pages 129-163, 2009
Publication Year :
2014

Abstract

We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be effectively represented using a distribution over permutations called the Generalized Mallows Model. We apply our method to three complementary discourse-level tasks: cross-document alignment, document segmentation, and information ordering. Our experiments show that incorporating our permutation-based model in these applications yields substantial improvements in performance over previously proposed methods.
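As a rough illustration of the permutation distribution named in the abstract, the sketch below samples from a Generalized Mallows Model in its common inversion-count form. In this form a permutation of n items is encoded by counts v_1, ..., v_{n-1}, where v_j is the number of items larger than j that appear before j, and each v_j is drawn independently with weight exp(-rho_j * v_j). This record does not state the parameterization, so that form, the dispersion values `rho`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sample_inversion_counts(n, rho, rng):
    """Draw v_1..v_{n-1}; v_j ranges over 0..n-j with weight exp(-rho_j * v_j).

    `rho` is a hypothetical list of n-1 dispersion parameters (assumed here).
    """
    counts = []
    for j in range(1, n):
        support = range(n - j + 1)
        weights = [math.exp(-rho[j - 1] * k) for k in support]
        counts.append(rng.choices(support, weights=weights)[0])
    return counts


def permutation_from_inversions(n, counts):
    """Rebuild the ordering: place items n, n-1, ..., 1 in turn.

    When item j is inserted, only larger items are present, so putting it
    at index v_j leaves exactly v_j larger items in front of it.
    """
    order = [n]
    for j in range(n - 1, 0, -1):
        order.insert(counts[j - 1], j)
    return order


def inversions_from_permutation(order):
    """Recover v_j = number of items larger than j that precede j."""
    n = len(order)
    pos = {item: i for i, item in enumerate(order)}
    return [
        sum(1 for k in range(j + 1, n + 1) if pos[k] < pos[j])
        for j in range(1, n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    v = sample_inversion_counts(5, [1.0] * 4, rng)
    print(v, permutation_from_inversions(5, v))
```

With all counts equal to zero the identity ordering is recovered, and larger `rho` values concentrate probability near that ordering, which is how a bias toward similar topic orderings across documents could be expressed under these assumptions.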

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1401.3488
Document Type :
Working Paper
Full Text :
https://doi.org/10.1613/jair.2830