Incremental Generation of A Decision Tree Using Global Discretization For Large Data

Authors :
Soo Won Lee
Kyong Sik Han
Source :
The KIPS Transactions:PartB, pp. 487-498
Publication Year :
2005
Publisher :
Korea Information Processing Society, 2005.

Abstract

Decision tree algorithms that can handle large datasets have recently attracted attention. However, because most of these algorithms process data in batch mode, they must rebuild the tree from scratch whenever new data is added. A more efficient way to reduce this rebuilding cost is to build the tree incrementally. Representative incremental tree construction algorithms are BOAT and ITI, and most such algorithms use a local discretization method to handle numeric data. However, because discretization requires sorted numeric data, a global discretization method that sorts all data only once is more suitable for large datasets than a local discretization method that sorts at every node. This paper proposes an incremental tree construction method that efficiently rebuilds a tree using global discretization to handle numeric data. When new data is added, the categories affected by the data must be recreated, and the tree structure must then be changed to reflect the category changes. To recreate categories efficiently, the proposed method extracts sample points and performs discretization on them; to adjust the tree structure to the category changes, it uses confidence intervals and a tree restructuring method. An experiment using a people database was conducted to compare the proposed method with an existing method that uses local discretization.
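
The contrast between global and local discretization can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not state which discretization criterion is used, so equal-frequency binning is assumed here purely for illustration, and the names `global_cut_points`, `to_category`, and the sample `ages` values are hypothetical. The point it shows is that the numeric attribute is sorted once, and every tree node can then reuse the same cut points instead of re-sorting its own subset.

```python
from bisect import bisect_right

def global_cut_points(values, n_bins):
    """Sort the attribute once and return n_bins - 1 cut points.

    Equal-frequency binning is an assumption made for this sketch;
    the abstract does not specify the discretization criterion.
    """
    ordered = sorted(values)  # the single global sort
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def to_category(value, cut_points):
    """Map a numeric value to a category index using the shared cut points."""
    return bisect_right(cut_points, value)

# Hypothetical numeric attribute values.
ages = [23, 35, 47, 51, 62, 18, 29, 40]
cuts = global_cut_points(ages, 4)
categories = [to_category(a, cuts) for a in ages]
```

Under this assumption, a newly added record can be assigned a category with `to_category` without any re-sorting; deciding when the cut points themselves must be recreated is the part the paper addresses with sample points and confidence intervals.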

Details

ISSN :
1598-284X
Database :
OpenAIRE
Journal :
The KIPS Transactions:PartB
Accession number :
edsair.doi...........183ed227fc0059078c3d9f2c14f25b0e
Full Text :
https://doi.org/10.3745/kipstb.2005.12b.4.487