Web document clustering using hyperlink structures
- Source :
- Computational Statistics & Data Analysis. Nov 2002, Vol. 41 Issue 1, p19. 27p.
- Publication Year :
- 2002
Abstract
- With the exponential growth of information on the World Wide Web, there is great demand for developing efficient methods for effectively organizing the large amount of retrieved information. Document clustering plays an important role in information retrieval and taxonomy management for the Web. In this paper we examine three clustering methods: K-means, multi-level METIS, and the recently developed normalized-cut method using a new approach of combining textual information, hyperlink structure and co-citation relations into a single similarity metric. We found the normalized-cut method with the new similarity metric is particularly effective, as demonstrated on three datasets of web query results. We also explore some theoretical connections between the normalized-cut method and the K-means method. [Copyright Elsevier]
- Subjects :
- *WORLD Wide Web
- *INFORMATION retrieval
Details
- Language :
- English
- ISSN :
- 0167-9473
- Volume :
- 41
- Issue :
- 1
- Database :
- Academic Search Index
- Journal :
- Computational Statistics & Data Analysis
- Publication Type :
- Periodical
- Accession number :
- 7919342
- Full Text :
- https://doi.org/10.1016/S0167-9473(02)00070-1