Cross-document event clustering using knowledge mining from co-reference chains
- Source :
- Information Processing & Management. March, 2007, Vol. 43 Issue 2, p327, 17 p.
- Publication Year :
- 2007
Abstract
- Full text: http://dx.doi.org/10.1016/j.ipm.2006.07.016
- Byline: June-Jei Kuo, Hsin-Hsi Chen
- Keywords: Controlled vocabulary; Co-reference chains; Event clustering; Multi-document summarization
- Abstract: Unifying terminology usages which captures more term semantics is useful for event clustering. This paper proposes a metric of normalized chain edit distance to mine, incrementally, controlled vocabulary from cross-document co-reference chains. Controlled vocabulary is employed to unify terms among different co-reference chains. A novel threshold model that incorporates both time decay function and spanning window uses the controlled vocabulary for event clustering on streaming news. Under correct co-reference chains, the proposed system has a 15.97% performance increase compared to the baseline system, and a 5.93% performance increase compared to the system without introducing controlled vocabulary. Furthermore, a Chinese co-reference resolution system with a chain filtering mechanism is used to experiment on the robustness of the proposed event clustering system. The clustering system using noisy co-reference chains still achieves a 10.55% performance increase compared to the baseline system. The above shows that our approach is promising.
- Author Affiliation: Department of Computer Science and Information Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 106, Taiwan
- Article History: Received 16 May 2006; Accepted 25 July 2006
- Subjects :
- Business
- Computers and office automation industries
Details
- Language :
- English
- ISSN :
- 03064573
- Volume :
- 43
- Issue :
- 2
- Database :
- Gale General OneFile
- Journal :
- Information Processing & Management
- Publication Type :
- Periodical
- Accession number :
- edsgcl.196171445