TEXTUAL-BASED CLUSTERING OF WEB DOCUMENTS.
- Source :
- International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems. Dec 2004, Vol. 12 Issue 6, p715-743. 29p.
- Publication Year :
- 2004
Abstract
- In our study we presented an effective method for clustering Web pages. From flat HTML files we extracted keywords, formed feature vectors to represent the Web pages, and applied them to a clustering method. We took advantage of the Fuzzy C-Means (FCM) clustering algorithm. We demonstrated an organized and schematic manner of data collection. Various categories of Web pages were retrieved from the ODP (Open Directory Project) in order to create our datasets. The clustering results showed that the method performs well for all datasets. Finally, we presented a comprehensive experimental study examining the behavior of the algorithm for different input parameters, the internal structure of the datasets, and classification experiments. [ABSTRACT FROM AUTHOR]
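The pipeline named in the abstract (keywords from flat HTML, feature vectors, Fuzzy C-Means) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how keywords were selected or weighted, so the term-frequency features, the helper names `keyword_vectors` and `fuzzy_c_means`, and the parameter defaults (`vocab_size`, fuzzifier `m=2.0`, `tol`, `seed`) are all assumptions made for illustration. Only the standard FCM update rules are used.

```python
import math
import random
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text from an HTML string, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def keyword_vectors(pages, vocab_size=50):
    """Assumed feature step: normalized term frequencies over common words."""
    docs = []
    for html in pages:
        parser = _TextExtractor()
        parser.feed(html)
        text = " ".join(parser.parts).lower()
        docs.append(Counter(re.findall(r"[a-z]{3,}", text)))
    totals = Counter()
    for doc in docs:
        totals.update(doc)
    vocab = [word for word, _ in totals.most_common(vocab_size)]
    vectors = []
    for doc in docs:
        n = sum(doc[w] for w in vocab) or 1
        vectors.append([doc[w] / n for w in vocab])
    return vocab, vectors


def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM: returns (centers, U), U[i][j] = membership of point i in cluster j."""
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    # Random initial memberships; each row is normalized to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [U[i][j] ** m for i in range(n)]
            tw = sum(w) or 1.0  # guard against an empty cluster
            centers.append(
                [sum(w[i] * X[i][k] for i in range(n)) / tw for k in range(dim)]
            )
        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        new_U = []
        for i in range(n):
            d = [math.dist(X[i], centers[j]) for j in range(c)]
            zeros = [j for j, x in enumerate(d) if x == 0.0]
            if zeros:
                # A point sitting exactly on a center belongs fully to it.
                row = [1.0 / len(zeros) if j in zeros else 0.0 for j in range(c)]
            else:
                inv = [x ** (-2.0 / (m - 1.0)) for x in d]
                total = sum(inv)
                row = [v / total for v in inv]
            new_U.append(row)
        change = max(abs(new_U[i][j] - U[i][j]) for i in range(n) for j in range(c))
        U = new_U
        if change < tol:
            break
    return centers, U
```

A call such as `vocab, X = keyword_vectors(pages)` followed by `centers, U = fuzzy_c_means(X, c=3)` yields one membership row per page. Unlike hard clustering, each page receives a degree of membership in every cluster, and the fuzzifier `m` controls how soft those assignments are.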
Details
- Language :
- English
- ISSN :
- 02184885
- Volume :
- 12
- Issue :
- 6
- Database :
- Academic Search Index
- Journal :
- International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems
- Publication Type :
- Academic Journal
- Accession number :
- 16257974
- Full Text :
- https://doi.org/10.1142/S021848850400317X