Clustering of Web Sessions Using Levenshtein Metric

Authors :
Sergey V. Kuznetsov
Andrei Scherbina
Source :
Advances in Data Mining ISBN: 9783540240549, Industrial Conference on Data Mining
Publication Year :
2004
Publisher :
Springer Berlin Heidelberg, 2004.

Abstract

Various commercial and scientific applications require analysis of user behaviour on the Internet. For example, web marketing and network technical support can benefit from the classification of web users. Such classification can be achieved by tracking the pages a user visits during one session (one visit to a particular site). For automated classification of user sessions, we propose a distance that compares sessions by the sequence of pages in them and by the categories of these pages. The proposed distance is based on the Levenshtein metric. The Fuzzy C-Medoids algorithm was used for clustering, since it has almost linear complexity. The Davies-Bouldin, Entropy, and Bezdek validity indices were used to assess the quality of the proposed method. Testing shows that, in this domain, our distance outperforms both the Euclidean and Edit distances.
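The abstract does not give the formula of the proposed distance, so the following is only a minimal sketch of the plain Levenshtein (edit) distance on which it is said to be based, applied to sessions represented as sequences of page identifiers. The function name `levenshtein` and the sample sessions are hypothetical, and the category-aware comparison described in the abstract is not reproduced here.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Plain edit distance: the minimum number of insertions, deletions,
    and substitutions needed to turn sequence a into sequence b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence in the inner loop
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, page_a in enumerate(a, start=1):
        current = [i]
        for j, page_b in enumerate(b, start=1):
            cost = 0 if page_a == page_b else 1
            current.append(min(
                previous[j] + 1,         # delete page_a
                current[j - 1] + 1,      # insert page_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical sessions: each is the ordered list of pages visited.
s1 = ["/home", "/catalog", "/item/42", "/cart"]
s2 = ["/home", "/search", "/item/42", "/cart", "/checkout"]
print(levenshtein(s1, s2))
```

Because the function only needs equality between elements, the same sketch works on sequences of page categories instead of page URLs; how the two are combined in the proposed distance is left to the full text.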

Details

ISBN :
978-3-540-24054-9
Database :
OpenAIRE
Accession number :
edsair.doi...........42d01b4eca9944079a4a302674d582ce
Full Text :
https://doi.org/10.1007/978-3-540-30185-1_14