Temporal knowledge extraction from large-scale text corpus
- Source :
- World Wide Web. 24:135-156
- Publication Year :
- 2020
- Publisher :
- Springer Science and Business Media LLC, 2020.
Abstract
- Knowledge, in practice, is time-variant, and many relations are valid only for a certain period of time. This phenomenon highlights the importance of harvesting time-aware knowledge, i.e., relational facts coupled with their valid temporal intervals. Inspired by pattern-based information extraction systems, we resort to temporal patterns to extract time-aware knowledge from free text. However, pattern design is extremely laborious and time-consuming even for a single relation, and free text is usually ambiguous, which makes temporal instance extraction extremely difficult. Therefore, in this work, we study the problem of temporal knowledge extraction in two steps: (1) temporal pattern extraction, by automatically analysing a large-scale text corpus with a small number of seed temporal facts, and (2) temporal instance extraction, by applying the identified temporal patterns. For pattern extraction, we introduce various techniques, including corpus annotation, pattern generation, scoring and clustering, to improve both the accuracy and the coverage of the extracted patterns. For instance extraction, we propose a double-check strategy to improve accuracy and a set of node-extension rules to improve coverage. We conduct extensive experiments on real-world datasets and compare our approach with state-of-the-art systems. Experimental results verify the effectiveness of our proposed methods for temporal knowledge harvesting.
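The two-step idea in the abstract (learn temporal patterns from seed facts, then apply them to extract new instances) can be illustrated with a toy sketch. This is not the authors' system: the corpus, the seed fact, the slot names `S`, `O`, `T1`, `T2`, the string-based pattern generation, the support threshold and the `T1 <= T2` sanity filter are all assumptions made for illustration. The filter in particular is only a stand-in and does not reproduce the paper's double-check strategy or its node-extension rules.

```python
import re
from collections import Counter

# Hypothetical toy data: one seed fact (subject, object, start, end) and a tiny corpus.
SEEDS = [("Alice", "Acme", "2001", "2005")]
CORPUS = [
    "Alice worked at Acme from 2001 to 2005.",
    "Bob worked at Globex from 2010 to 2014.",
    "Carol visited Initech in 2012.",
]
SLOTS = ("S", "O", "T1", "T2")


def generate_patterns(corpus, seeds):
    """Step 1: turn sentences that mention a complete seed fact into slot patterns."""
    counts = Counter()
    for sentence in corpus:
        for fact in seeds:
            if all(value in sentence for value in fact):
                pattern = sentence
                for slot, value in zip(SLOTS, fact):
                    pattern = pattern.replace(value, f"<{slot}>")
                counts[pattern] += 1  # support = number of seed matches
    return counts


def pattern_to_regex(pattern):
    """Compile a slot pattern into a regex with one named group per slot."""
    regex = re.escape(pattern)
    for slot in ("S", "O"):  # entity slots: a capitalised name (toy assumption)
        regex = regex.replace(re.escape(f"<{slot}>"), f"(?P<{slot}>[A-Z][\\w ]*?)")
    for slot in ("T1", "T2"):  # time slots: a four-digit year (toy assumption)
        regex = regex.replace(re.escape(f"<{slot}>"), f"(?P<{slot}>\\d{{4}})")
    return re.compile("^" + regex + "$")


def extract_instances(corpus, patterns, min_support=1):
    """Step 2: apply sufficiently supported patterns to harvest temporal facts."""
    facts = set()
    for pattern, support in patterns.items():
        if support < min_support:
            continue
        regex = pattern_to_regex(pattern)
        for sentence in corpus:
            match = regex.match(sentence)
            # Simple sanity filter: the interval must not end before it starts.
            if match and int(match["T1"]) <= int(match["T2"]):
                facts.add(tuple(match[slot] for slot in SLOTS))
    return facts


if __name__ == "__main__":
    learned = generate_patterns(CORPUS, SEEDS)
    for fact in sorted(extract_instances(CORPUS, learned)):
        print(fact)
```

In this sketch the single seed yields one pattern, which then also matches the sentence about Bob, so one known fact is enough to recover a second fact with its valid interval. A realistic system would need entity and time annotation rather than raw string replacement, which is where the corpus annotation, scoring and clustering mentioned in the abstract come in.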
- Subjects :
- Text corpus
Relation (database)
Computer Networks and Communications
Computer science
Interval temporal logic
02 engineering and technology
Set (abstract data type)
Information extraction
Annotation
Knowledge extraction
Hardware and Architecture
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Data mining
Cluster analysis
Scale (map)
Software
Details
- ISSN :
- 1573-1413 and 1386-145X
- Volume :
- 24
- Database :
- OpenAIRE
- Journal :
- World Wide Web
- Accession number :
- edsair.doi...........465cb93c9b09ad24b2ca47ca5dfb5b82
- Full Text :
- https://doi.org/10.1007/s11280-020-00836-5