1. Evaluating Fairness and Generalizability in Models Predicting On-Time Graduation from College Applications
- Author
Hutt, Stephen, Gardner, Margo, Duckworth, Angela L., and D'Mello, Sidney K.
- Abstract
We explore generalizability and fairness across sociodemographic groups for predicting on-time college graduation using a national dataset of 41,359 college applications. Our features include sociodemographics, institutional graduation rates, academic achievement, standardized test scores, engagement in extracurricular activities, and work experiences. We identify five latent classes based on available sociodemographic data and train Random Forest classifiers to predict 4-year graduation. When the classifiers are trained and tested individually on each class using a split-half validation method, they achieve AUROCs between 0.629 and 0.694. We then evaluate how a model trained on the entire dataset performs on each latent class by performing a slicing analysis, finding a 6 to 10 percent improvement in AUROCs compared to the individual-class models. We explore the fairness of our model by extending the slicing analysis to consider Absolute Between ROC Area (ABROCA), finding similar values for each of our latent classes. We consider how our results might be used to avoid perpetuating biases inherent in college application data. [Additional funding was provided by the Mindset Scholars Network. For the full proceedings, see ED599096.]
- Published
- 2019
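
The ABROCA measure named in the abstract is commonly described as the area between the ROC curves of two groups scored by the same model, so a value near 0 means the model ranks both groups about equally well. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the function names (`roc_points`, `tpr_at`, `abroca`), the grid size, and the toy labels and scores are all assumptions made for illustration.

```python
from bisect import bisect_right


def roc_points(labels, scores):
    """Return ROC points (fpr, tpr), ordered by increasing threshold leniency."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("each group needs both positive and negative labels")
    # Sweep thresholds from the highest score down, handling tied scores together.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def tpr_at(points, x):
    """Linearly interpolate TPR at false-positive rate x (upper value on vertical steps)."""
    fprs = [p[0] for p in points]
    tprs = [p[1] for p in points]
    j = bisect_right(fprs, x)  # first point with fpr strictly greater than x
    if j == len(fprs):
        return tprs[-1]
    x0, y0 = fprs[j - 1], tprs[j - 1]
    x1, y1 = fprs[j], tprs[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def abroca(labels_a, scores_a, labels_b, scores_b, n=1000):
    """Approximate the absolute area between two ROC curves with the trapezoid rule."""
    roc_a = roc_points(labels_a, scores_a)
    roc_b = roc_points(labels_b, scores_b)
    diffs = [abs(tpr_at(roc_a, k / n) - tpr_at(roc_b, k / n)) for k in range(n + 1)]
    return sum((diffs[k] + diffs[k + 1]) / 2 for k in range(n)) / n


# Toy data (hypothetical): the same scores, ranked correctly for group A
# and in reverse for group B.
scores = [0.9, 0.8, 0.2, 0.1]
group_a_labels = [1, 1, 0, 0]
group_b_labels = [0, 0, 1, 1]
gap = abroca(group_a_labels, scores, group_b_labels, scores)
```

In a slicing analysis like the one the abstract describes, a function of this kind would be applied to one trained model's predictions, comparing each sociodemographic slice against a chosen reference slice. The grid-based integration is an approximation, so results are accurate only up to the grid resolution `n`.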