Maximizing the Diversity of Ensemble Random Forests for Tree Genera Classification Using High Density LiDAR Data
- Source :
- Remote Sensing, Volume 8, Issue 8, Pages: 646 (2016)
- Publication Year :
- 2016
- Publisher :
- Multidisciplinary Digital Publishing Institute, 2016.
Abstract
- Recent research into improving the effectiveness of forest inventory management using airborne LiDAR data has focused on developing advanced theories in data analytics. Furthermore, supervised learning as a predictive model for classifying tree genera (and species, where possible) has been gaining popularity in order to minimize this labor-intensive task. However, bottlenecks remain that hinder the immediate adoption of supervised learning methods. With supervised classification, training samples are required for learning the parameters that govern the performance of a classifier, yet the selection of training data is often subjective and the quality of such samples is critically important. For LiDAR scanning in forest environments, the quantification of data quality is somewhat abstract, normally referring to some metric related to the completeness of individual tree crowns; however, this is not an issue that has received much attention in the literature. Intuitively, the choice of training samples having varying quality will affect classification accuracy. In this paper, a Diversity Index (DI) is proposed that characterizes the diversity of data quality (Qi) among selected training samples required for constructing a classification model of tree genera. The training sample is diversified in terms of data quality as opposed to the number of samples per class. The diversified training sample allows the classifier to better learn the positive and negative instances and, therefore, has a higher classification accuracy in discriminating the “unknown” class samples from the “known” samples. Our algorithm is implemented within the Random Forests base classifiers with six derived geometric features from LiDAR data. The training sample contains three tree genera (pine, poplar, and maple) and the validation samples contain four labels (pine, poplar, maple, and “unknown”). Classification accuracy improved from 72.8%, when training samples were selected randomly (with stratified sample size), to 93.8%, when samples were selected with additional criteria, and from 88.4% to 93.8% when an ensemble method was used.
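The abstract does not give the formula for the Diversity Index, so the following is only a minimal sketch of the general idea of diversifying a training sample by data quality instead of by class count alone. It assumes each tree has a quality score in [0, 1] and uses normalized Shannon entropy over quality bins as a stand-in diversity measure. The functions `bin_index`, `quality_diversity`, and `select_diverse`, the bin count, and the entropy measure are all hypothetical and are not taken from the paper.

```python
import math
import random
from collections import defaultdict


def bin_index(q, n_bins):
    """Map a quality score in [0, 1] to a bin index in 0..n_bins-1."""
    return min(int(q * n_bins), n_bins - 1)


def quality_diversity(qualities, n_bins=4):
    """Normalized Shannon entropy of binned quality scores.

    Stand-in measure (NOT the paper's DI): 0 means every sample falls in
    one quality bin, 1 means samples are spread evenly over all bins.
    """
    total = len(qualities)
    if total == 0 or n_bins < 2:
        return 0.0
    counts = defaultdict(int)
    for q in qualities:
        counts[bin_index(q, n_bins)] += 1
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_bins)


def select_diverse(samples, per_class, n_bins=4, seed=0):
    """Pick up to per_class samples per label, cycling through quality bins.

    samples: list of (label, quality) tuples (hypothetical layout).
    """
    rng = random.Random(seed)
    by_class_bin = defaultdict(lambda: defaultdict(list))
    for label, q in samples:
        by_class_bin[label][bin_index(q, n_bins)].append((label, q))

    chosen = []
    for label, bins in by_class_bin.items():
        # Shuffle each bin's pool, then take one sample per bin in turn.
        pools = [rng.sample(pool, len(pool)) for _, pool in sorted(bins.items())]
        picked = []
        while len(picked) < per_class and any(pools):
            for pool in pools:
                if pool and len(picked) < per_class:
                    picked.append(pool.pop())
        chosen.extend(picked)
    return chosen


# Toy data: three genera, eight trees each, two trees per quality bin.
toy_qualities = [0.05, 0.1, 0.3, 0.35, 0.55, 0.6, 0.8, 0.95]
toy_samples = [(g, q) for g in ("pine", "poplar", "maple") for q in toy_qualities]
training = select_diverse(toy_samples, per_class=4)
```

A selection produced this way could then be passed to any Random Forests implementation. Comparing `quality_diversity` for a randomly drawn stratified sample against the diversified one gives a rough sense of how differently the two strategies cover the quality range.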
- Subjects :
- random forests
LiDAR
010504 meteorology & atmospheric sciences
Computer science
0211 other engineering and technologies
02 engineering and technology
Machine learning
computer.software_genre
01 natural sciences
lcsh:Science
021101 geological & geomatics engineering
0105 earth and related environmental sciences
Training set
Forest inventory
business.industry
Supervised learning
tree genera classification
Pattern recognition
15. Life on land
Random forest
Stratified sampling
Lidar
Data quality
General Earth and Planetary Sciences
lcsh:Q
Artificial intelligence
ensemble classification
business
diversity maximization
Classifier (UML)
computer
Details
- Language :
- English
- ISSN :
- 20724292
- Database :
- OpenAIRE
- Journal :
- Remote Sensing
- Accession number :
- edsair.doi.dedup.....895360f718ec03e421e8f609bda439d6
- Full Text :
- https://doi.org/10.3390/rs8080646