
An implementation of cloud-based platform with R packages for spatiotemporal analysis of air pollution.

Authors :
Yang, Chao-Tung
Chan, Yu-Wei
Liu, Jung-Chun
Lou, Ben-Shen
Source :
Journal of Supercomputing; Mar2020, Vol. 76 Issue 3, p1416-1437, 22p
Publication Year :
2020

Abstract

Recently, R has become a popular tool for big data analysis because of its many mature packages for data analysis and visualization, including the analysis of air pollution. Air pollution is of increasing global concern because it greatly impacts the environment and human health. With the rapid development of the IoT and the increasing accuracy of geographical information collected by sensors, a huge amount of air pollution data has been generated. It is therefore difficult to analyze these data effectively and reliably in a single-machine environment, owing to the inherent characteristics of its memory design. In this work, we construct a distributed computing environment based on both RHadoop and SparkR to perform the analysis and visualization of air pollution with R more reliably and effectively. We first use sensors called EdiGreen AirBox to collect air pollution data in Taichung, Taiwan. Then, we adopt the Inverse Distance Weighting method to transform the sensor data into a density map. Finally, the experimental results show the accuracy of short-term PM2.5 predictions made with the ARIMA model. The experimental results also include a verification of the prediction accuracy using the MAPE method. [ABSTRACT FROM AUTHOR]
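
The abstract names two techniques without showing how they work: Inverse Distance Weighting (IDW) for turning point sensor readings into a map, and MAPE for scoring predictions. The following is a minimal sketch of both, not the authors' implementation. The paper works in R with RHadoop and SparkR; Python is used here only for illustration. The function names `idw_interpolate` and `mape`, the `power=2.0` default, and the sample readings are all assumptions, not taken from the paper.

```python
import math


def idw_interpolate(stations, query, power=2.0):
    """Estimate a value at `query` (x, y) from (x, y, value) station readings.

    Each station is weighted by 1 / distance**power, so nearer sensors
    contribute more. `power` is an assumed default, not the paper's setting.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            return value  # query coincides with a station: use its reading
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes equal-length, non-empty sequences and nonzero actual values.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical PM2.5 readings at two sensors, queried at the midpoint.
sensors = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_interpolate(sensors, (1.0, 0.0))

# Hypothetical observed vs. forecast values.
error_pct = mape([100.0, 200.0], [110.0, 180.0])
```

Evaluating `idw_interpolate` over a regular grid of query points would give the kind of density map the abstract describes; a lower `mape` value indicates a closer match between forecast and observation.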

Details

Language :
English
ISSN :
0920-8542
Volume :
76
Issue :
3
Database :
Complementary Index
Journal :
Journal of Supercomputing
Publication Type :
Academic Journal
Accession number :
142316658
Full Text :
https://doi.org/10.1007/s11227-017-2189-1