1. Random Forest as an Imputation Method for Education and Psychology Research: Its Impact on Item Fit and Difficulty of the Rasch Model
- Author
Golino, Hudson F. and Gomes, Cristiano M. A.
- Abstract
This paper presents random forest, a non-parametric imputation technique from the machine learning field. The random forest procedure has two main tuning parameters: the number of trees grown for the prediction and the number of predictors used. Fifty experimental conditions were created for the imputation procedure, combining different numbers of predictors (from 1 to 10) and numbers of trees (10, 50, 100, 500 and 1000). We examined how each experimental condition affected item fit and item difficulty when an inductive reasoning test was fitted to the dichotomous Rasch model. The results indicate that using random forest to impute missing values is a reliable technique for psychological research, since it led to statistically significant differences in the median infit in only 4% of the experimental conditions investigated, compared with the result for the original data set. However, researchers should be aware that in 32% of the experimental conditions the imputation procedure significantly increased the median of the estimated item difficulties, compared with the original data set.
- Published
- 2016
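
The experimental design described in the abstract can be sketched as a simple grid, which also makes the reported percentages easier to read as counts. This is only an illustrative sketch: the variable names (`predictor_counts`, `tree_counts`, `share_to_count`) are assumptions introduced here, and the code enumerates the conditions without running any imputation or Rasch analysis.

```python
from itertools import product

# Tuning values as stated in the abstract.
predictor_counts = range(1, 11)          # 1 to 10 predictors
tree_counts = [10, 50, 100, 500, 1000]   # number of trees grown

# Every (predictors, trees) pair is one experimental condition.
conditions = list(product(predictor_counts, tree_counts))
n_conditions = len(conditions)


def share_to_count(percent, total):
    """Convert a whole-number percentage of `total` into a count (hypothetical helper)."""
    return percent * total // 100


# Percentages reported in the abstract, expressed as numbers of conditions.
infit_conditions = share_to_count(4, n_conditions)
difficulty_conditions = share_to_count(32, n_conditions)

print(n_conditions, infit_conditions, difficulty_conditions)
```

Read this way, the abstract's 4% corresponds to 2 of the 50 conditions, and its 32% corresponds to 16 of the 50 conditions.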