1. Text segmentation with topic modeling and entity coherence
- Author
- Adebayo Kolawole John, Guido Boella, Luigi Di Caro, Abraham, A, Haqiq, A, Alimi, AM, Mezzour, G, Rokbani, N, Muda, AK
- Subjects
Topic model, Boundary detection, Text segmentation, Entity coherence, Computer science, Boundary (topology), 02 engineering and technology, Dirichlet distribution, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Latent Dirichlet allocation, Topic modeling, Control and Systems Engineering, Computer Science (all), Transition (fiction), Window (computing), Pattern recognition, Coherence (statistics), 16. Peace & justice, 020201 artificial intelligence & image processing, Artificial intelligence, Natural language processing
- Abstract
This paper describes a system that uses entity and topic coherence to improve Text Segmentation (TS) accuracy. First, we used the Latent Dirichlet Allocation (LDA) algorithm to obtain topics for the sentences in a document. We then performed entity mapping across a window of sentences to track how entities transition from one sentence to the next. We used this information to adjust the boundaries proposed by our LDA-based boundary detection. We report the significance of the entity coherence approach as well as the superiority of our algorithm over existing work.
- Published
- 2016