1. A fast approach to identify trending articles in hot topics from XML based big bibliographic datasets.
- Author
Swaraj, K. and Manjula, D.
- Subjects
BIG data, XML (Extensible Markup Language), ONTOLOGY, SOFTWARE frameworks, TRENDS, COMPUTER research, BIBLIOGRAPHY
- Abstract
XML-based big bibliographic datasets are now common in many domains and provide metadata about the articles published in each domain. Their well-defined tags give the year, title, authors, abstract, keywords, article type, publication venue, and other specific details of each article. Many statistics can be extracted from such a dataset. However, a tag identifying the domain subtopic of an article is usually absent, because the subtopic is not an article attribute. Hence, for subtopic-level statistics, each article must first be mapped to its associated subdomain. This paper investigates this problem and proposes a fast approach to finding trending articles and hot topics in XML-based big bibliographic datasets. The proposed framework first uses a domain ontology to classify articles into their subtopics. Fast detection of hot topics, trending keywords, and articles is achieved using novel MapReduce algorithms implemented on a Hadoop distributed framework. A performance comparison demonstrates that the approach outperforms its non-MapReduce counterpart in quickly identifying the trending keywords and titles in a particular hot topic from an XML-based bibliographic dataset. [ABSTRACT FROM AUTHOR]
- Published
2016