A novel user trend‐based priority assigner and URL scheduler for dynamic incremental crawling.

Authors :
Gupta, Ashlesha
Dixit, Ashutosh
Source :
Concurrency & Computation: Practice & Experience; 2/1/2022, Vol. 34 Issue 3, p1-14, 14p
Publication Year :
2022

Abstract

Summary: An efficient search engine must provide relevant and accurate information that matches user needs and interests. The quality of downloaded records can be guaranteed only when crawlers download highly pertinent web pages that match current topics or user trends. Earlier focused crawlers downloaded topic-specific pages but could not adapt to users' changing interests. There is therefore a need for crawlers that automatically track current trends and download web pages that meet users' present needs. This paper proposes a priority assigner and scheduler method for organizing Uniform Resource Locators (URLs) that helps the crawler track users' interests and prioritize downloading documents relevant to both the user's choices and current trends. The experimental results confirm that crawling based on the proposed priority assigner and URL scheduler outperforms conventional crawling strategies based on Change‐history or Site‐Map‐based methods, both in the quality of downloaded web pages and in reducing network traffic over the Internet. [ABSTRACT FROM AUTHOR]
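
Illustrative sketch (not taken from the paper): the abstract names a priority assigner and a URL scheduler but does not give their formulas, so the following Python is only a minimal guess at the general idea. The scoring function, the `alpha` blend weight, the `trend_weight` mapping, and all class and function names are assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class ScheduledURL:
    # heapq is a min-heap, so the negated score is stored to pop
    # the highest-priority URL first.
    neg_priority: float
    url: str = field(compare=False)


def priority(page_terms, user_interest, trend_weight, alpha=0.5):
    """Hypothetical score: blend of user-interest overlap and trend weight.

    This is an assumed formula, not the one used by the authors.
    """
    terms = set(page_terms)
    if not terms:
        return 0.0
    # Fraction of the page's terms that the user has shown interest in.
    interest = len(terms & set(user_interest)) / len(terms)
    # Strongest current-trend weight among the page's terms.
    trend = max(trend_weight.get(t, 0.0) for t in terms)
    return alpha * interest + (1 - alpha) * trend


class URLScheduler:
    """Hands URLs to the crawler in descending priority order."""

    def __init__(self):
        self._heap = []

    def add(self, url, score):
        heapq.heappush(self._heap, ScheduledURL(-score, url))

    def next_url(self):
        return heapq.heappop(self._heap).url if self._heap else None


# Usage with made-up data.
trend_weight = {"election": 0.9, "cricket": 0.4}
user_interest = {"cricket", "sports"}

scheduler = URLScheduler()
scheduler.add("https://example.com/a",
              priority(["cricket", "sports"], user_interest, trend_weight))
scheduler.add("https://example.com/b",
              priority(["election", "news"], user_interest, trend_weight))
scheduler.add("https://example.com/c",
              priority(["recipes"], user_interest, trend_weight))
```

In this sketch a page that matches the user's interests can outrank a page that is merely trending; how the two signals are actually weighted in the published method would have to be taken from the full text.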

Details

Language :
English
ISSN :
15320626
Volume :
34
Issue :
3
Database :
Complementary Index
Journal :
Concurrency & Computation: Practice & Experience
Publication Type :
Academic Journal
Accession number :
154565430
Full Text :
https://doi.org/10.1002/cpe.6555