The NAEP EDM Competition: On the Value of Theory-Driven Psychometrics and Machine Learning for Predictions Based on Log Data
- Source :
- International Educational Data Mining Society. 2020.
- Publication Year :
- 2020
Abstract
- The "2nd Annual WPI-UMASS-UPENN EDM Data Mining Challenge" required contestants to predict efficient test-taking based on log data. In this paper, we describe our theory-driven and psychometric modeling approach. For feature engineering, we employed the Log-Normal Response Time Model for estimating latent person speed, and the Generalized Partial Credit Model for estimating latent person ability. Additionally, we adopted an n-gram feature approach for event sequences. For training a multi-label classifier, we distinguished inefficient test takers who were going too fast and those who were going too slow, instead of using the provided binary target label. Our best-performing ensemble classifier comprised three sets of low-dimensional classifiers, dominated by test-taker speed. While our classifier reached moderate performance relative to the competition leaderboard, our approach makes two important contributions. First, we show how explainable classifiers could provide meaningful predictions if results can be contextualized to test administrators who wish to intervene or take action. Second, our re-engineering of test scores enabled us to incorporate person ability into the estimation. However, ability was hardly predictive of efficient behavior, leading to the conclusion that the target label's validity needs to be questioned. The paper concludes with tools that are helpful for substantively meaningful log data mining. [For the full proceedings, see ED607784.]
Details
- Language :
- English
- Database :
- ERIC
- Journal :
- International Educational Data Mining Society
- Publication Type :
- Conference
- Accession number :
- ED608068
- Document Type :
- Speeches/Meeting Papers; Reports - Research