A Bayesian vector autoregression-based data analytics approach to enable irregularly-spaced mixed-frequency traffic collision data imputation with missing values

Authors :
Guohui Zhang
Hao Yu
Zhenning Li
Jun Wang
Source :
Transportation Research Part C: Emerging Technologies. 108:302-319
Publication Year :
2019
Publisher :
Elsevier BV, 2019.

Abstract

Traffic collision data are typically collected at irregular intervals and at mixed frequencies. Conventional treatments of such data, for instance aggregating high-frequency data to a lower frequency, can discard relevant high-frequency information and introduce potential temporal instabilities. A novel Bayesian vector autoregression (VAR) approach is proposed to address this problem. An unevenly spaced traffic collision dataset with missing values, containing all collisions of different severities that occurred on state highways in Washington State from January 2006 to December 2016, is used to study the impacts of transportation-, weather-, and socioeconomic-related characteristics on traffic collisions. A Gibbs sampler is used to conduct Bayesian inference for the model parameters and the unobserved high-frequency variables. Results show that the model achieves good fit accuracy and is able to capture the unobserved heterogeneity in the dataset. The proposed VAR also performs better than other missing-value imputation techniques, including linear regression, predictive mean matching, k-nearest neighbors, and random forests. This study offers guidance for constructing models that account for the mixed-frequency time-series nature of such data.
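
As a rough illustration of how a Gibbs sampler can alternate between drawing model parameters and drawing unobserved values, the sketch below uses a deliberately simplified stand-in: a univariate Gaussian AR(1) series with missing entries, not the mixed-frequency Bayesian VAR of the paper. The function name `gibbs_ar1_impute`, the flat prior on the coefficient, the 1/sigma^2 prior on the noise variance, and the requirement that the first value be observed are all assumptions made for this example and are not taken from the article.

```python
import random


def gibbs_ar1_impute(y, n_iter=500, burn_in=100, seed=0):
    """Gibbs sampler for x[t] = phi * x[t-1] + e[t], e[t] ~ N(0, sigma2).

    y is a list of floats with None marking missing values. Illustrative
    assumption: y[0] is observed. Returns (posterior mean of phi,
    {index: posterior mean of the imputed value}).
    """
    if y[0] is None:
        raise ValueError("this sketch assumes the first value is observed")
    rng = random.Random(seed)
    n = len(y)
    missing = [t for t, v in enumerate(y) if v is None]
    observed = [v for v in y if v is not None]

    # Initialise missing entries at the mean of the observed values.
    fill = sum(observed) / len(observed)
    x = [fill if v is None else v for v in y]

    sigma2 = 1.0
    phi_sum = 0.0
    imputed_sum = {t: 0.0 for t in missing}
    kept = 0

    for it in range(n_iter):
        # Step 1: phi | x, sigma2 under a flat prior is Gaussian.
        sxx = sum(x[t - 1] ** 2 for t in range(1, n))
        sxy = sum(x[t - 1] * x[t] for t in range(1, n))
        phi = rng.gauss(sxy / sxx, (sigma2 / sxx) ** 0.5)

        # Step 2: sigma2 | x, phi under p(sigma2) ~ 1/sigma2 is
        # inverse-gamma with shape (n-1)/2 and scale ssr/2.
        ssr = sum((x[t] - phi * x[t - 1]) ** 2 for t in range(1, n))
        sigma2 = 1.0 / rng.gammavariate((n - 1) / 2.0, 2.0 / ssr)

        # Step 3: each missing x[t] | neighbours, phi, sigma2 is Gaussian.
        for t in missing:
            if t == n - 1:
                # Last point: only the previous value is informative.
                mean, var = phi * x[t - 1], sigma2
            else:
                mean = phi * (x[t - 1] + x[t + 1]) / (1.0 + phi ** 2)
                var = sigma2 / (1.0 + phi ** 2)
            x[t] = rng.gauss(mean, var ** 0.5)

        # Accumulate posterior means after the burn-in period.
        if it >= burn_in:
            kept += 1
            phi_sum += phi
            for t in missing:
                imputed_sum[t] += x[t]

    imputed = {t: s / kept for t, s in imputed_sum.items()}
    return phi_sum / kept, imputed
```

Calling `gibbs_ar1_impute(series)` on a list in which gaps are marked with `None` returns a posterior-mean estimate of the autoregressive coefficient and one imputed value per gap. Extending this idea to several series observed at different frequencies would require the multivariate state-space treatment described in the full paper.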

Details

ISSN :
0968090X
Volume :
108
Database :
OpenAIRE
Journal :
Transportation Research Part C: Emerging Technologies
Accession number :
edsair.doi...........e289e614c4b6e9cc8d9184e30774ad2b
Full Text :
https://doi.org/10.1016/j.trc.2019.09.013