1. Efficient data mining for maximal frequent subtrees
- Author
J.-F. Yao and Y. Xiao
- Subjects
Fractal tree index, business.industry, Frequent subtree mining, Pattern recognition, T-tree, Interval tree, computer.software_genre, Search tree, Tree (data structure), Tree traversal, Tree structure, Artificial intelligence, Data mining, business, computer, Mathematics
- Abstract
A new type of tree mining is defined, which uncovers maximal frequent induced subtrees from a database of unordered labeled trees. A novel algorithm, PathJoin, is proposed. The algorithm uses a compact data structure, FST-Forest, which compresses the trees and still keeps the original tree structure. PathJoin generates candidate subtrees by joining the frequent paths in FST-Forest. Such candidate subtree generation is localized and thus substantially reduces the number of candidate subtrees. Experiments with synthetic data sets show that the algorithm is effective and efficient.
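The idea of compressing a tree database into shared frequent paths can be illustrated with a small sketch. This is not the PathJoin algorithm or the FST-Forest structure from the paper, whose details the abstract does not give. It is a simplified illustration that assumes trees are written as `(label, children)` tuples, considers only paths starting at the root, and treats same-labeled siblings as one. The function names `root_paths`, `frequent_root_paths`, and `build_prefix_forest` are hypothetical.

```python
from collections import Counter


def root_paths(tree):
    """Return the set of root-to-node label paths of a (label, children) tree."""
    label, children = tree
    paths = {(label,)}
    for child in children:
        for path in root_paths(child):
            paths.add((label,) + path)
    return paths


def frequent_root_paths(database, min_support):
    """Keep the paths that occur in at least min_support trees."""
    counts = Counter()
    for tree in database:
        # root_paths returns a set, so each path counts once per tree.
        counts.update(root_paths(tree))
    return {path: n for path, n in counts.items() if n >= min_support}


def build_prefix_forest(paths):
    """Merge paths into nested dicts so that shared prefixes are stored once."""
    forest = {}
    for path in paths:
        node = forest
        for label in path:
            node = node.setdefault(label, {})
    return forest


# Hypothetical toy database of three unordered labeled trees.
database = [
    ("a", [("b", [("d", [])]), ("c", [])]),
    ("a", [("c", []), ("b", [("e", [])])]),
    ("a", [("b", [("d", [])])]),
]

frequent = frequent_root_paths(database, min_support=2)
forest = build_prefix_forest(frequent)
print(forest)
```

With a minimum support of 2, the infrequent path a/b/e is dropped and the remaining paths collapse into one compact structure. Candidate subtrees would then be formed by combining paths that share a prefix in this structure, which is the sense in which candidate generation stays local.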
- Published
- 2004