Integrating OpenStreetMap crowdsourced data and Landsat time-series imagery for rapid land use/land cover (LULC) mapping: Case study of the Laguna de Bay area of the Philippines

Authors :
Kotaro Iizuka
Brian Alan Johnson
Source :
Applied Geography. 67:140-149
Publication Year :
2016
Publisher :
Elsevier BV, 2016.

Abstract

We explored the potential for rapid land use/land cover (LULC) mapping using time-series Landsat satellite imagery and training data (for supervised classification) automatically extracted from crowdsourced OpenStreetMap (OSM) “landuse” (OSM-LU) and “natural” (OSM-N) polygon datasets. The main challenge with using these data for LULC classification was their high level of noise, as the Landsat images all contained varying degrees of cloud cover (causes of attribute noise) and the OSM polygons contained locational errors and class labeling errors (causes of class noise). A second challenge arose from the imbalanced class distribution in the extracted training data, which occurred due to wide discrepancies in the area coverage of each OSM-LU/OSM-N class. To address the first challenge, three relatively noise-tolerant algorithms – naive Bayes (NB), decision tree (C4.5 algorithm), and random forest (RF) – were evaluated for image classification. To address the second challenge, artificial training samples were generated for the minority classes using the synthetic minority over-sampling technique (SMOTE). Image classification accuracies were calculated for a four-class, five-class, and six-class LULC system to assess the capability of the proposed methods for mapping relatively broad as well as more detailed LULC types, and the highest overall accuracies achieved were 84.0% (four-class SMOTE-RF result), 81.0% (five-class SMOTE-RF result), and 72.0% (six-class SMOTE-NB result). RF and NB had relatively similar overall accuracies, while those of C4.5 were much lower. SMOTE led to higher classification accuracies for RF and C4.5, and in some cases for NB, despite the noise in the training set. The main advantages of the proposed methods are their cost- and time-efficiency, as training data for supervised classification is automatically extracted from the crowdsourced datasets and no pre-processing for cloud detection/cloud removal is performed.
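
The oversampling step named in the abstract can be illustrated with a minimal sketch of the general SMOTE idea: each artificial sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the random six-feature "pixel" array are all hypothetical and chosen only for illustration.

```python
import numpy as np


def smote(minority, n_synthetic, k=5, seed=0):
    """Sketch of SMOTE: interpolate between minority samples and their neighbours.

    minority    : (n, d) array of minority-class feature vectors (hypothetical input)
    n_synthetic : number of artificial samples to generate
    k           : number of nearest neighbours considered for each sample
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise Euclidean distances; a sample is never its own neighbour.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its k neighbours for every synthetic sample.
    base = rng.integers(0, n, size=n_synthetic)
    chosen = neighbours[base, rng.integers(0, k, size=n_synthetic)]

    # Place each synthetic sample at a random position between the two points.
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[chosen] - minority[base])


# Hypothetical data: 10 minority-class samples with 6 features each.
minority_pixels = np.random.default_rng(1).random((10, 6))
synthetic = smote(minority_pixels, n_synthetic=40)
print(synthetic.shape)
```

Because every synthetic sample is a convex combination of two existing minority samples, label or location errors in the original training polygons carry over into the generated samples, which is why the abstract's remark that SMOTE helped "despite the noise in the training set" is worth noting.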

Details

ISSN :
0143-6228
Volume :
67
Database :
OpenAIRE
Journal :
Applied Geography
Accession number :
edsair.doi...........96d8bc375776c3b75a98ff1205f06128