Prediction of PM2.5 pollution in Tehran air based on temperature and pressure using Markovian regime-switching non-parametric additive transitive regression model
- Source :
- ریاضی و جامعه, Vol 8, Iss 4, Pp 1-21 (2023)
- Publication Year :
- 2023
- Publisher :
- University of Isfahan, 2023.
Abstract
- In this paper, we introduce the Markovian regime-switching regression model, a graphical model based on the hidden Markov model. It can be viewed as a clustered regression model in which a Markov process governs the transition from one cluster to another. These clusters are the hidden states of the hidden Markov model, which are assumed to form a Markov process of order one. The other assumptions of the hidden Markov model are retained, while the emission distribution is taken to be the conditional distribution of the response given the covariates and the state. As an application, we study the prediction of PM2.5 pollution in Tehran's air based on temperature and pressure during 2015-2017 using the Markovian regime-switching non-parametric additive transitive model. Furthermore, the R package hhsmm is introduced as a powerful tool for fitting this model.

1. Introduction

State-switching models are models in which the distribution of a sequence of observations (usually over a time interval) is controlled by a sequence of hidden states, such that the conditional distribution of the observations given one state differs from that given any other state. Hidden Markov and semi-Markov models [27] are the most common instances of state-switching models; in them the hidden state sequence is a Markov or semi-Markov process. Other models, such as regime-switching models and the Kalman filter model, also belong to this category. Researchers have applied such models to speech recognition [12], cognitive learning [24], brain performance modeling [15], modeling of environmental processes [4, 5, 6], sequential analysis, reliability theory [7], biological analysis [8, 9, 27], and many other areas.

2. Main Results

A hidden Markov model is specified by the following components:

(1) Transition probability matrix $\pmb{\Gamma} =(\gamma_{ij})$, where
\begin{equation*}\gamma_{ij}=\Pr(S_{t+1}=j|S_t=i), \quad i,j=1,\ldots,J,\end{equation*}
such that each row sums to one:
\begin{equation*}\sum_{j=1}^{J}{\gamma_{ij}}=1, \quad i=1,\ldots,J.\end{equation*}

(2) Initial state probabilities $\pmb{\delta}=(\delta_j)$, where
\begin{equation*}\delta_j=\Pr(S_1=j), \quad j=1,\ldots,J; \quad \sum_{j=1}^{J}{\delta_j}=1.\end{equation*}

(3) Observation distributions $f_1(y),\ldots,f_J(y)$, where
\begin{equation*}f_j(y)=\Pr(Y_t=y | S_t=j), \quad j=1,\ldots,J,\end{equation*}
which are also called state-dependent distributions or emission distributions. When $y_t$ is a continuous random variable, $f_j(y)$ is a probability density function, usually that of a normal distribution or a mixture of normal distributions.

The regime-switching regression model was introduced by [14] as follows:
\begin{equation*}y_{t} = x_{t}^T \beta_{s_t} + \sigma_{s_t}\epsilon_t, \tag{2.1}\end{equation*}
in which $\{y_t\}$ is the sequence of responses, $\{x_t\}$ is the sequence of covariates, $\{\epsilon_t\}$ is a sequence of (usually) i.i.d. normally distributed errors with zero mean and unit variance, and $\beta_{s_t}$ and $\sigma_{s_t}$ are the regression coefficients and the standard deviation of the errors at state $s_t$, respectively.

A generalization of model (2.1) to the additive regime-switching regression model was introduced by [20] as follows:
\begin{equation*}y_{t} = \mu_{s_t} + \sum_{j=1}^p f_{j,s_t}(x_{j,t}) + \sigma_{s_t}\epsilon_t, \tag{2.2}\end{equation*}
in which $\mu_{s_t}$ is a state-dependent intercept and $f_{j,s_t}$ is a state-dependent smooth function of the $j$th covariate $x_{j,t}$. Letting $x_t = (y_{t-\ell},\ldots,y_{t-L}, z_{t-\ell},\ldots,z_{t-L})$ in (2.2), that is, taking lagged values of the response $y$ and of the exogenous covariates $z$ for lags $L > \ell \geq 1$, gives the non-parametric additive transitive regime-switching regression model.

All models in this paper, together with the tools needed for modeling, initialization, fitting, and prediction, are included in the R package hhsmm, which can be downloaded from https://cran.r-project.org/package=hhsmm.
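As an illustration of how the three hidden Markov components combine with the linear regime-switching model (2.1), the following Python sketch simulates data from a two-state model and evaluates its log-likelihood with the scaled forward algorithm. It is a minimal sketch under stated assumptions, not the hhsmm implementation: the function names (`normal_pdf`, `simulate`, `log_likelihood`) and all parameter values are hypothetical and chosen only for demonstration, and the errors are assumed to be standard normal.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def simulate(n, delta, gamma, betas, sigmas, rng):
    """Draw (x_t, y_t, s_t) from y_t = x_t' beta_{s_t} + sigma_{s_t} * eps_t."""
    states = range(len(delta))
    xs, ys, ss = [], [], []
    s = rng.choices(states, weights=delta)[0]  # S_1 ~ delta
    for _ in range(n):
        x = (1.0, rng.uniform(-1.0, 1.0))  # intercept plus one toy covariate
        mean = sum(b * v for b, v in zip(betas[s], x))
        xs.append(x)
        ys.append(mean + sigmas[s] * rng.gauss(0.0, 1.0))
        ss.append(s)
        s = rng.choices(states, weights=gamma[s])[0]  # next state from row s
    return xs, ys, ss


def log_likelihood(xs, ys, delta, gamma, betas, sigmas):
    """Scaled forward algorithm: log marginal density of y_1..y_n given x_1..x_n."""
    J = len(delta)
    alpha = list(delta)  # predicted state probabilities for t = 1
    loglik = 0.0
    for x, y in zip(xs, ys):
        # Weight each state by its emission density f_j(y_t | x_t).
        for j in range(J):
            mean = sum(b * v for b, v in zip(betas[j], x))
            alpha[j] *= normal_pdf(y, mean, sigmas[j])
        c = sum(alpha)  # one-step predictive density of y_t
        loglik += math.log(c)
        filtered = [a / c for a in alpha]
        # Propagate: row i of gamma holds Pr(S_{t+1} = j | S_t = i).
        alpha = [sum(filtered[i] * gamma[i][j] for i in range(J)) for j in range(J)]
    return loglik


# Hypothetical two-state parameters (illustrative values only).
delta = [0.5, 0.5]
gamma = [[0.9, 0.1], [0.2, 0.8]]  # each row sums to one
betas = [(0.0, 2.0), (5.0, -1.0)]
sigmas = [0.5, 1.0]

xs, ys, ss = simulate(200, delta, gamma, betas, sigmas, random.Random(1))
print(log_likelihood(xs, ys, delta, gamma, betas, sigmas))
```

With a single state the recursion reduces to the ordinary Gaussian regression log-likelihood, which gives a simple sanity check. The additive model (2.2) would replace the linear mean `x_t' beta_j` with a sum of state-specific smooth functions; that step is not shown here.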
The reader is also referred to [3] for more information and examples on the hhsmm package.

3. Summary of Proofs/Conclusions

The data set for this paper comes from two sources. The AQI data (PM2.5 values) were obtained from https://airnow.tehran.ir/, while the air temperature and pressure were obtained from the Iran Meteorological Organization. Figure 3 shows the time-series plots of this data set.

To visualize the additive regime-switching regression model, we first consider only the temperature as the covariate. Figure 4 presents the prediction of PM2.5 in Tehran's air using a non-parametric additive regime-switching regression model, based only on the air temperature, in each of the four hidden states. The points in each state are shown in a distinct color, and the prediction curve for that state is drawn in the same color. As the figure shows, the predicted curves fit the points in each state fairly well.

As a competitor model, we consider the single-state additive regression model. Figure 5 compares the prediction precision for PM2.5 in Tehran's air under two fitted models: the regime-switching regression model with four hidden states and the single-state non-parametric additive regression model. The mean squared error of each model is reported in its plot. The non-parametric additive regime-switching regression model with four hidden states performs better than the single-state non-parametric additive regression model.

Another model introduced here is the additive transitive regime-switching regression model. Figure 6 shows the prediction of PM2.5 in Tehran's air using a transitive regime-switching non-parametric additive regression model with a 1-day lag. The mean squared error of the model is reported in the plot. This model performs better than the two other competitors.
Finally, the additive transitive regime-switching model is used to predict future values of PM2.5. Figure 7 presents the out-of-sample prediction of PM2.5 in Tehran's air using a transitive regime-switching non-parametric additive regression model with a 10-day lag. The mean squared error of the model is 99.3. The result of the prediction is satisfactory.
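The lagged covariate vector of the transitive model and the mean squared error used to compare the fits can be sketched in Python as follows. This is only a sketch under assumptions: the helper names `lagged_design` and `mean_squared_error` are hypothetical, the series are toy numbers rather than the Tehran data, and a single exogenous series `z` stands in for temperature and pressure.

```python
def lagged_design(y, z, ell, L):
    """Build x_t = (y_{t-ell},...,y_{t-L}, z_{t-ell},...,z_{t-L}) with target y_t."""
    if not (L > ell >= 1):
        # Mirrors the condition L > ell >= 1 stated with model (2.2).
        raise ValueError("lags must satisfy L > ell >= 1")
    if len(y) != len(z):
        raise ValueError("y and z must have the same length")
    rows, targets = [], []
    for t in range(L, len(y)):  # the first L time points lack a full lag history
        y_lags = [y[t - k] for k in range(ell, L + 1)]
        z_lags = [z[t - k] for k in range(ell, L + 1)]
        rows.append(y_lags + z_lags)
        targets.append(y[t])
    return rows, targets


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Toy series (made-up numbers, for illustration only).
y = [10.0, 12.0, 11.0, 15.0, 14.0]
z = [1.0, 2.0, 3.0, 4.0, 5.0]

rows, targets = lagged_design(y, z, ell=1, L=2)
for row, target in zip(rows, targets):
    print(row, "->", target)
```

Each row would then serve as the covariate vector $x_t$ in model (2.2), and the mean squared error is computed between the observed values and the model's predictions.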
Details
- Language :
- Persian
- ISSN :
- 23456493 and 23456507
- Volume :
- 8
- Issue :
- 4
- Database :
- Directory of Open Access Journals
- Journal :
- ریاضی و جامعه
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.1967adab7b974f24ac0cd6ce976028f5
- Document Type :
- article
- Full Text :
- https://doi.org/10.22108/msci.2023.138405.1593