1. Multivariate Regression Forest for Categorical Attribute Data
- Author
- LIU Zhen-yu, SONG Xiao-ying
- Subjects
decision trees, multi-variable regression trees, ensemble learning, random forest, gradient boosting, Computer software, QA76.75-76.765, Technology (General), T1-995
- Abstract
Because categorical attributes cannot be used directly in some regression models, such as linear regression, SVR, and most multivariate regression trees, this paper proposes a multivariate split method that handles multiple types of data. We define the centers of sample sets on categorical attributes, and the distances from samples to those centers, so that categorical attributes can participate in the clustering process just as numerical attributes do. A suitable ensemble scheme is then selected for the decision trees generated by this method, yielding an ensemble called cluster regression forest (CRF). Finally, we compare CRF with 9 other regression models in terms of mean absolute error (MAE) and root mean square error (RMSE) on 12 public UCI data sets. The experimental results show that CRF has the best performance among the 10 regression models.
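The abstract does not give the paper's exact definitions of a categorical center or of the sample-to-center distance. As a rough illustration of the general idea only, the sketch below assumes a k-prototypes-style convention: the center takes the mean on numerical attributes and the most frequent value (mode) on categorical attributes, and the distance adds a squared difference for numerical attributes and a weighted mismatch penalty for categorical ones. The function names `center` and `distance` and the weight `gamma` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def center(samples, cat_idx):
    """Center of a sample set (assumed convention, not the paper's definition):
    mean on numerical columns, mode on categorical columns."""
    n_cols = len(samples[0])
    ctr = []
    for j in range(n_cols):
        col = [row[j] for row in samples]
        if j in cat_idx:
            # Most frequent category in this column.
            ctr.append(Counter(col).most_common(1)[0][0])
        else:
            ctr.append(sum(col) / len(col))
    return ctr


def distance(sample, ctr, cat_idx, gamma=1.0):
    """Mixed-type distance from a sample to a center (assumed convention):
    squared difference on numerical columns, gamma per categorical mismatch."""
    d = 0.0
    for j, (x, c) in enumerate(zip(sample, ctr)):
        if j in cat_idx:
            d += gamma * (x != c)
        else:
            d += (x - c) ** 2
    return d


# Toy data: column 0 is numerical, column 1 is categorical.
samples = [[1.0, "red"], [3.0, "red"], [2.0, "blue"]]
ctr = center(samples, cat_idx={1})
d_blue = distance([2.0, "blue"], ctr, cat_idx={1})
d_far = distance([4.0, "red"], ctr, cat_idx={1})
```

With a distance of this kind, samples containing categorical attributes can be assigned to the nearest center in the same way as purely numerical samples, which is the property the abstract relies on for cluster-based splits.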
- Published
- 2022