Research on Topic Crawler Strategy Based on Web Page Extension and Best Priority Search Algorithm

Authors :
Ganglong Fan
Yanqing Lv
Hongsheng Xu
Source :
Advances in Intelligent Systems and Computing ISBN: 9783319987750
Publication Year :
2018
Publisher :
Springer International Publishing, 2018.

Abstract

A topic crawler uses a web page analysis algorithm to filter out links unrelated to the topic, keeps the topic-related links, and places them in a queue of URLs to be fetched. Following a given search strategy, it then selects the next URL from the queue and repeats the process until a system stopping condition is reached. The best priority (best-first) search strategy uses a web page analysis algorithm to predict the similarity between each candidate URL and the target pages, and selects the one or more highest-scoring URLs to fetch. A crawler algorithm based on web page extension evaluates indirectly related web pages or websites through known web pages or data. The paper presents research on a topic crawler strategy based on web page extension and the best priority search algorithm.
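The crawl loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scoring rule (fraction of topic terms found in a link's anchor text), the `threshold` and `max_pages` parameters, the `fetch` callback, and the in-memory `TOY_WEB` graph are all assumptions made for illustration, and the web page extension step is not modeled.

```python
import heapq
from typing import Callable, List, Optional, Set, Tuple

# A fetched page is represented only by its outgoing links: (url, anchor text).
Links = List[Tuple[str, str]]


def relevance(text: str, topic_terms: Set[str]) -> float:
    """Hypothetical scoring rule: fraction of topic terms present in the text."""
    if not topic_terms:
        return 0.0
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def best_first_crawl(
    seeds: List[str],
    fetch: Callable[[str], Optional[Links]],
    topic_terms: Set[str],
    threshold: float = 0.3,
    max_pages: int = 10,
) -> List[str]:
    """Visit pages in order of predicted relevance, dropping off-topic links."""
    frontier: List[Tuple[float, int, str]] = []  # (negated score, tie-breaker, url)
    counter = 0
    seen = set(seeds)
    for url in seeds:
        heapq.heappush(frontier, (-1.0, counter, url))
        counter += 1

    visited: List[str] = []
    while frontier and len(visited) < max_pages:  # stopping condition
        _, _, url = heapq.heappop(frontier)       # best-scoring candidate first
        links = fetch(url)
        if links is None:                         # page could not be fetched
            continue
        visited.append(url)
        for link_url, anchor_text in links:
            if link_url in seen:
                continue
            score = relevance(anchor_text, topic_terms)
            if score >= threshold:                # keep only topic-related links
                seen.add(link_url)
                heapq.heappush(frontier, (-score, counter, link_url))
                counter += 1
    return visited


# Toy link graph standing in for the web (assumed data, no network access).
TOY_WEB = {
    "a": [("b", "topic crawler search"), ("c", "cooking recipes"), ("d", "crawler")],
    "b": [("e", "focused crawler search strategy")],
    "d": [],
    "e": [],
}

print(best_first_crawl(["a"], TOY_WEB.get, {"crawler", "search"}))
# prints ['a', 'b', 'e', 'd']
```

In this toy run, page "c" is never fetched because its anchor text shares no terms with the topic, and "e" is fetched before "d" because its predicted relevance is higher, even though "d" was discovered earlier.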

Details

ISBN :
978-3-319-98775-0
ISBNs :
9783319987750
Database :
OpenAIRE
Journal :
Advances in Intelligent Systems and Computing ISBN: 9783319987750
Accession number :
edsair.doi...........34413836bff7161f3dd3c14d7f13e2f8
Full Text :
https://doi.org/10.1007/978-3-319-98776-7_138