Web document clustering using hyperlink structures

Authors :
He, Xiaofeng
Zha, Hongyuan
Ding, Chris H.Q.
Simon, Horst D.
Source :
Computational Statistics & Data Analysis. Nov 2002, Vol. 41, Issue 1, p. 19. 27 pp.
Publication Year :
2002

Abstract

With the exponential growth of information on the World Wide Web, there is great demand for efficient methods of organizing the large amount of retrieved information. Document clustering plays an important role in information retrieval and taxonomy management for the Web. In this paper we examine three clustering methods: K-means, multi-level METIS, and the recently developed normalized-cut method, using a new approach that combines textual information, hyperlink structure and co-citation relations into a single similarity metric. We found that the normalized-cut method with the new similarity metric is particularly effective, as demonstrated on three datasets of web query results. We also explore some theoretical connections between the normalized-cut method and the K-means method. [Copyright Elsevier]

Details

Language :
English
ISSN :
0167-9473
Volume :
41
Issue :
1
Database :
Academic Search Index
Journal :
Computational Statistics & Data Analysis
Publication Type :
Periodical
Accession number :
7919342
Full Text :
https://doi.org/10.1016/S0167-9473(02)00070-1