Air quality data clustering using EPLS method
- Source :
- RiuNet. Repositorio Institucional de la Universitat Politècnica de València
- Publication Year :
- 2017
- Publisher :
- Elsevier BV, 2017.
Abstract
- [EN] Air quality data can now be collected easily by sensors around the world, and analysis of these data is valuable for societal decision-making. Among the five major air pollutants used to calculate the AQI (Air Quality Index), PM2.5 is the one of greatest public concern. PM2.5 data are also influenced by other factors in the air and are non-linear and non-stationary, with high noise levels and outliers. Because of these inherent characteristics, traditional methods do not cluster PM2.5 data well. In this paper, a novel model-based feature extraction method, EPLS, is proposed to address this issue. The EPLS model includes: (1) Mode Decomposition, in which the EEMD algorithm is applied to the aggregated dataset; (2) Dimension Reduction, which is carried out to obtain a more significant set of vectors; (3) Least Squares Projection, in which all testing data are projected onto the obtained vectors. A synthetic dataset and an air quality dataset are evaluated with different clustering methods and similarity measures. Experimental results demonstrate that EPLS is efficient in dealing with air quality clustering problems that involve high noise levels and outliers, and that it can be adapted to various clustering techniques and distance measures. © 2016 Elsevier B.V. All rights reserved. This work was supported in part by the National Natural Science Foundation of China (Nos. 61440018, 61501411), the Hubei Natural Science Foundation (No. 2014CFB904), and China Scholarship Council funding.
- Subjects :
- Computer science
Feature extraction
Correlation clustering
PM2.5
02 engineering and technology
computer.software_genre
Clustering
Distance measures
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Cluster analysis
Air quality index
PCA
Dimensionality reduction
ARQUITECTURA Y TECNOLOGIA DE COMPUTADORES
EEMD
Hardware and Architecture
Air quality
Signal Processing
Outlier
020201 artificial intelligence & image processing
Data mining
computer
PM25
Software
Information Systems
Test data
Details
- ISSN :
- 1566-2535
- Volume :
- 36
- Database :
- OpenAIRE
- Journal :
- Information Fusion
- Accession number :
- edsair.doi.dedup.....7841edc2ebb8952935ed319af3ceaff2
- Full Text :
- https://doi.org/10.1016/j.inffus.2016.11.015
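
The abstract names three EPLS stages: mode decomposition, dimension reduction, and least squares projection. The following is a minimal sketch of stages (2) and (3) only, not the authors' implementation. It assumes the mode components from stage (1) are already available (for example from an EEMD library); here plain sinusoids stand in for them. It also assumes PCA computed via SVD for the dimension reduction, since PCA appears among the record's subject terms. The function names `principal_vectors` and `least_squares_features`, and all parameter values, are hypothetical.

```python
import numpy as np


def principal_vectors(components, k):
    """Return the k leading principal directions of the rows of `components`.

    `components` has shape (n_components, n_samples); each row is one
    mode component. PCA is done with an SVD of the row-centered matrix.
    """
    centered = components - components.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions in sample space,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def least_squares_features(series, basis):
    """Project each row of `series` onto `basis` in the least squares sense.

    For every series x, finds coefficients c minimizing ||basis.T @ c - x||.
    Returns an array of shape (n_series, n_basis_vectors).
    """
    coeffs, *_ = np.linalg.lstsq(basis.T, series.T, rcond=None)
    return coeffs.T


# Stand-in for stage (1): sinusoids used in place of real EEMD components (assumption).
t = np.linspace(0.0, 1.0, 200)
stand_in_modes = np.stack([np.sin(2 * np.pi * f * t) for f in (1, 3, 7, 15)])

# Stage (2): keep the two most significant vectors.
basis = principal_vectors(stand_in_modes, k=2)

# Stage (3): project noisy synthetic "testing" series onto those vectors.
rng = np.random.default_rng(0)
series = np.sin(2 * np.pi * 3 * t) + 0.5 * rng.standard_normal((5, t.size))
features = least_squares_features(series, basis)
```

Each row of `features` is a low-dimensional description of one series. Consistent with the abstract's claim that EPLS adapts to various clustering techniques and distance measures, such feature rows could be passed to a clustering method of choice.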