
Estimation of discrete choice models considering simultaneously multiple objectives and complex data characteristics.

Authors :
Beeramoole, Prithvi Bhat
Kelly, Ryan
Haque, Md Mazharul
Pinz, Alban
Paz, Alexander
Source :
Transportation Research Part C: Emerging Technologies. Mar2024, Vol. 160, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

• Multi-objective formulation including multiple criteria for hypothesis testing.
• Estimation considering simultaneously multiple complex modelling and data aspects.
• A metaheuristic and maximum log likelihood to test large numbers of hypotheses.
• Comparative benchmarking including multiple data sources and published models.
• Capabilities to capture more insights relative to current practice and methods.

This paper focuses on the discrete choice estimation problem, which involves multiple objectives and testing a broad range of hypotheses that can affect both interpretability and prediction accuracy. Previous studies have proposed mathematical programming formulations to assist with hypothesis testing and estimation. However, there is limited knowledge regarding the effect of in- and out-of-sample model performance criteria during the search for parsimonious specifications. To address this knowledge gap, a multi-objective optimization framework is proposed, including both in-sample goodness-of-fit and out-of-sample predictive accuracy, to generate multiple unique specifications and perform extensive hypothesis testing considering simultaneously potential explanatory variables, their functional forms, nonlinearities, heterogeneous effects, and correlations. A metaheuristic was designed and implemented to solve the proposed multi-objective nonlinear mixed-integer mathematical programming problem. Experiments, including various datasets and discrete choices, were used to illustrate the efficacy of the proposed framework. The goal was to find specifications that are either similar to or dominate those reported in the literature, considering both interpretability and prediction accuracy. Important insights regarding potential explanatory factors and heterogeneous preferences, which were not reported in the literature, were captured using the proposed framework.
In addition, for one of the datasets used in this study, the proposed framework enabled the discovery of three distinct clusters considering specification type and model performance in terms of interpretability and prediction accuracy. For the given dataset, these clusters suggest that the proposed approach allowed extensive exploration of the data across different specification types. The Mixed-Logit models with correlated parameters were found to perform significantly better in terms of in-sample fit than those without correlation, while multinomial-Logit models showed the worst in-sample performance for the given dataset. In contrast, multinomial-Logit models provided superior out-of-sample fit relative to advanced specifications, which illustrates trade-offs between model in- and out-of-sample fitness. A comparative analysis, including multiple performance measures, was also conducted. The results suggest that model evaluation using in-sample Bayesian Information Criterion (BIC) with out-of-sample Mean Absolute Error (MAE), or in-sample BIC with out-of-sample Mean Squared Error (MSE), enables estimation of specifications with better in- and out-of-sample performance compared to those estimated using maximum log-likelihood and minimum number of model parameters. In addition, a mostly linear relationship was observed between in-sample and out-of-sample log-likelihood, indicating that the latter does not provide much additional information regarding prediction compared to the in-sample estimates. These results showed the value of using an optimization framework to support modelling decisions by enabling extensive hypothesis testing and including multiple performance criteria as well as complex data characteristics to discover important and reliable insights. [ABSTRACT FROM AUTHOR]
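
The notion of one specification "dominating" another under two criteria, as used in the abstract, can be illustrated with a small sketch. It is not the authors' metaheuristic, which this record does not describe; it is only a plain Pareto-dominance filter over candidate specifications scored by in-sample BIC and out-of-sample MAE, both treated as quantities to minimise. The `Spec` type, the function names, and all numeric values are hypothetical and chosen for illustration; only the standard BIC definition, k·ln(n) − 2·ln(L), is taken as given.

```python
import math
from typing import List, NamedTuple


class Spec(NamedTuple):
    """A candidate model specification with two scores to minimise (hypothetical type)."""
    name: str
    bic: float  # in-sample Bayesian Information Criterion
    mae: float  # out-of-sample Mean Absolute Error


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Standard BIC: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def dominates(a: Spec, b: Spec) -> bool:
    """True if a is no worse than b on both criteria and strictly better on at least one."""
    no_worse = a.bic <= b.bic and a.mae <= b.mae
    strictly_better = a.bic < b.bic or a.mae < b.mae
    return no_worse and strictly_better


def pareto_front(specs: List[Spec]) -> List[Spec]:
    """Keep only the specifications that no other specification dominates."""
    return [s for s in specs if not any(dominates(o, s) for o in specs)]


# Made-up scores, for illustration only (not results from the paper).
candidates = [
    Spec("MNL", 2100.0, 0.30),
    Spec("Mixed logit", 2050.0, 0.34),
    Spec("Mixed logit, correlated", 2010.0, 0.36),
    Spec("Dominated example", 2120.0, 0.35),
]

front = pareto_front(candidates)
for spec in front:
    print(spec.name, spec.bic, spec.mae)
```

With these invented numbers, the first three candidates remain on the front because each trades a lower BIC for a higher MAE, while the fourth is dropped because the first is better on both criteria. This mirrors, in miniature, the in-sample versus out-of-sample trade-off the abstract reports.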

Details

Language :
English
ISSN :
0968090X
Volume :
160
Database :
Academic Search Index
Journal :
Transportation Research Part C: Emerging Technologies
Publication Type :
Academic Journal
Accession number :
175936394
Full Text :
https://doi.org/10.1016/j.trc.2024.104517