1. PM 2.5 Concentration Forecasting Using Weighted Bi-LSTM and Random Forest Feature Importance-Based Feature Selection.
- Author
Kim, Baekcheon; Kim, Eunkyeong; Jung, Seunghwan; Kim, Minseok; Kim, Jinyong; Kim, Sungshin
- Subjects
*DEEP learning, *RANDOM forest algorithms, *MACHINE learning, *FEATURE selection, *FORECASTING, *SUPPORT vector machines
- Abstract
Particulate matter (PM) in the air can cause various health problems and diseases in humans. In particular, the small size of PM 2.5 particles enables them to penetrate deep into the lungs, causing severe health impacts. Exposure to PM 2.5 can result in respiratory, cardiovascular, and allergic diseases, and prolonged exposure has also been linked to an increased risk of cancer, including lung cancer. Therefore, forecasting the PM 2.5 concentration in the surrounding environment is crucial for preventing these adverse health effects. This paper proposes a method for forecasting the PM 2.5 concentration 1 h ahead using bidirectional long short-term memory (Bi-LSTM). The proposed method involves selecting input variables based on the feature importance calculated by random forest, classifying the data to assign weight variables to reduce bias, and forecasting the PM 2.5 concentration using Bi-LSTM. To evaluate the performance of the proposed method, two case studies were conducted. The first compares forecasting performance according to preprocessing. The second compares forecasting performance between deep learning models (long short-term memory, gated recurrent unit, and Bi-LSTM) and conventional machine learning models (multi-layer perceptron, support vector machine, decision tree, and random forest). In case study 1, the proposed method improves the performance indices (RMSE: 3.98%p, MAE: 5.87%p, RRMSE: 3.96%p, and R²: 0.72%p) because weights are assigned according to the input variables before forecasting is performed. In case study 2, Bi-LSTM, which considers both directions (forward and backward), forecasts more effectively than the conventional models (RMSE: 2.70, MAE: 0.84, RRMSE: 1.97, R²: 0.16). These results show that the proposed method can effectively forecast PM 2.5 even if the data in the high-concentration section is insufficient. [ABSTRACT FROM AUTHOR]
- Published
2023
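- Illustrative sketch
The abstract reports four performance indices (RMSE, MAE, RRMSE, R²) and mentions classifying the data to assign weights that reduce bias. The sketch below shows one way these pieces might be computed in plain Python. It is not the authors' implementation: the RRMSE definition (RMSE as a percentage of the mean observed value), the class edges in `concentration_class`, and the inverse-frequency weighting in `inverse_frequency_weights` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def rrmse(y_true, y_pred):
    """Relative RMSE in percent.

    Assumption: RMSE divided by the mean observed value, times 100.
    Other definitions exist; the abstract does not say which one is used.
    """
    mean_true = sum(y_true) / len(y_true)
    return 100.0 * rmse(y_true, y_pred) / mean_true


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def concentration_class(value, edges=(15.0, 35.0, 75.0)):
    """Map a PM 2.5 value to a class index.

    The edges are hypothetical placeholders, not thresholds from the paper.
    """
    return sum(value > e for e in edges)


def inverse_frequency_weights(values, edges=(15.0, 35.0, 75.0)):
    """Give each sample a weight inversely proportional to its class count.

    Assumption: rare (for example high-concentration) classes get larger
    weights, normalized so that the weights sum to the number of samples.
    """
    classes = [concentration_class(v, edges) for v in values]
    counts = Counter(classes)
    n, k = len(values), len(counts)
    return [n / (k * counts[c]) for c in classes]


if __name__ == "__main__":
    observed = [10.0, 20.0, 30.0, 40.0]
    forecast = [12.0, 18.0, 33.0, 39.0]
    print(rmse(observed, forecast), mae(observed, forecast))
    print(rrmse(observed, forecast), r2(observed, forecast))
    print(inverse_frequency_weights([10.0, 12.0, 14.0, 80.0]))
```

Weights of this kind could be passed as per-sample weights to a training loss so that the sparse high-concentration samples are not dominated by the many low-concentration ones; whether the paper applies its weights this way is not stated in the abstract.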