Section-Based Focus Time Estimation of News Articles
- Source :
- IEEE Access, Vol 6, Pp 75452-75460 (2018)
- Publication Year :
- 2018
- Publisher :
- IEEE, 2018.
Abstract
- Information retrieval systems embed temporal information to retrieve news documents related to temporal queries. One important aspect of a news document is its focus time, the time to which the content of the document refers. The current state of the art does not exploit focus time to retrieve relevant news documents. This paper investigates the inverted pyramid news paradigm to determine the focus time of news documents by extracting temporal expressions, normalizing their values, and assigning them scores on the basis of their position in the text. In this method, news documents are first divided into three sections following the inverted pyramid news paradigm. The paper presents a comprehensive analysis of four methods for splitting a news document into sections: the paragraph-based method, the words-based method, the sentence-based method, and the semantic-based method (SeBM). Temporal expressions in each section are assigned weights using a linear regression model. Finally, a scoring function calculates a temporal score for each time expression appearing in the document. The temporal expressions are then ranked by temporal score, with the most suitable expression at the top. The effectiveness of the proposed method is evaluated on a diverse dataset of news related to popular events; the results show that the proposed splitting methods achieved an average error of less than 5.6 years, whereas the SeBM achieved high precision scores of 0.35 and 0.77 at positions 1 and 2, respectively.
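The pipeline the abstract describes (split into three sections, extract temporal expressions, weight by section, score, rank) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the names `SECTION_WEIGHTS`, `YEAR_PATTERN`, `split_into_sections`, and `rank_focus_years` are hypothetical, the weights are placeholder values rather than the regression-fitted ones, the split is a naive equal-size sentence split, and only four-digit years are treated as temporal expressions.

```python
import re
from collections import defaultdict

# Placeholder weights for the three inverted-pyramid sections (assumption);
# the paper derives its weights from a linear regression model.
SECTION_WEIGHTS = (0.6, 0.3, 0.1)

# Simplification: only four-digit years from 1500 to 2099 count as
# temporal expressions, so no normalization step is needed.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def split_into_sections(text, n_sections=3):
    """Split text into n_sections runs of sentences of roughly equal size."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    size = -(-len(sentences) // n_sections)  # ceiling division
    return [" ".join(sentences[i * size:(i + 1) * size])
            for i in range(n_sections)]


def rank_focus_years(text):
    """Return (year, score) pairs, highest score first."""
    scores = defaultdict(float)
    for weight, section in zip(SECTION_WEIGHTS, split_into_sections(text)):
        for match in YEAR_PATTERN.finditer(section):
            # Each mention adds the weight of the section it appears in.
            scores[int(match.group(1))] += weight
    # Ties are broken by the earlier year to keep the ranking deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Calling `rank_focus_years` on an article's text returns candidate years ordered by score, so the first entry plays the role of the estimated focus time under these simplified assumptions.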
- Subjects :
- news retrieval
General Computer Science
Computer science
Section (typography)
InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL
050801 communication & media studies
02 engineering and technology
Semantics
computer.software_genre
0508 media and communications
0202 electrical engineering, electronic engineering, information engineering
Information retrieval
General Materials Science
Electrical and Electronic Engineering
Focus (computing)
Basis (linear algebra)
business.industry
05 social sciences
General Engineering
Function (mathematics)
Expression (mathematics)
focus time
inverted pyramid
Ranking
temporal information retrieval
020201 artificial intelligence & image processing
Artificial intelligence
lcsh:Electrical engineering. Electronics. Nuclear engineering
Paragraph
business
computer
lcsh:TK1-9971
Natural language processing
Sentence
Details
- Language :
- English
- ISSN :
- 2169-3536
- Volume :
- 6
- Database :
- OpenAIRE
- Journal :
- IEEE Access
- Accession number :
- edsair.doi.dedup.....75b575d09211e7ebbcb63578380650b7