Author: "Tien Bui, Dieu" / Topic: machine learning - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Tien Bui, Dieu"' showing total 75 results



1. Application of Classification and Regression Trees for Spatial Prediction of Rainfall-Induced Shallow Landslides in the Uttarakhand Area (India) Using GIS

Author: Pham, Binh Thai, Tien Bui, Dieu, Prakash, Indra, Singh, R.B., Series editor, Mal, Suraj, Series editor, Meadows, Michael E., Series editor, and Huggel, Christian, editor
Published: 2018
Full Text: View/download PDF

2. Bagging based Support Vector Machines for spatial prediction of landslides

Author: Pham, Binh Thai, Tien Bui, Dieu, and Prakash, Indra
Published: 2018
Full Text: View/download PDF

3. Development of a novel hybrid multi-boosting neural network model for spatial prediction of urban flood.

Author: Darabi, Hamid, Rahmati, Omid, Naghibi, Seyed Amir, Mohammadi, Farnoush, Ahmadisharaf, Ebrahim, Kalantari, Zahra, Torabi Haghighi, Ali, Soleimanpour, Seyed Masoud, Tiefenbacher, John P., and Tien Bui, Dieu
Subjects: ARTIFICIAL neural networks, FLOOD risk, FLOOD control, REGRESSION trees, FLOODS, MACHINE learning
Abstract: In this study, a new hybridized machine learning algorithm for urban flood susceptibility mapping, named MultiB-MLPNN, was developed using a multi-boosting technique and MLPNN. The model was tested in Amol City, Iran, a data-scarce city in an ungauged area which is prone to severe flood inundation events and currently lacks flood prevention infrastructure. Performance of the hybridized model was compared with that of a standalone MLPNN model, random forest and boosted regression trees. Area under the curve, efficiency, true skill statistic, Matthews correlation coefficient, misclassification rate, sensitivity and specificity were used to evaluate model performance. In validation, the MultiB-MLPNN model showed the best predictive performance. The hybridized MultiB-MLPNN model is thus useful for generating realistic flood susceptibility maps for data-scarce urban areas. The maps can be used to develop risk-reduction measures to protect urban areas from devastating floods, particularly where available data are insufficient to support physically based hydrological or hydraulic models. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

4. Spatial pattern assessment of tropical forest fire danger at Thuan Chau area (Vietnam) using GIS-based advanced machine learning algorithms: A comparative study

Author: Tien Bui Dieu, Nguyen Ngoc Thach, Nguyen Hong-Thi, Pham Xuan-Canh, Hoang Nhat-Duc, Dang Bao-Toan Ngo, and Bui Hang Thi
Subjects: Meteorology & atmospheric sciences, Wilcoxon signed-rank test, Forest management, Other engineering and technologies, Engineering and technology, Machine learning, Natural sciences, Cohen's kappa, Fire protection, Ecology, Evolution, Behavior and Systematics, Geological & geomatics engineering, Earth and related environmental sciences, Ecology, Artificial neural network, Applied Mathematics, Ecological Modeling, Pearson product-moment correlation coefficient, Computer Science Applications, Random forest, Geography, Computational Theory and Mathematics, Modeling and Simulation, Spatial ecology, Artificial intelligence, Algorithm
Abstract: Thuan Chau is a district in Vietnam seriously affected by forest fire, especially in 2016; however, no forest fire prediction research has been conducted for this region. Thus, knowledge of spatial patterns of fire danger of the district plays a key role in forest succession and ecological implications. This study's aim was to analyze the spatial pattern of fire danger for the tropical forest of Thuan Chau district using advanced machine learning algorithms, namely Support Vector Machine classifier (SVMC), Random Forests (RF), and Multilayer Perceptron Neural Network (MLP-Net). For this purpose, a GIS database for the study area was established with 564 forest fire locations and ten forest fire variables. Then, the Pearson correlation method was used to assess the correlation of the variables with the forest fire. In the next step, three forest fire danger models, SVMC, RF, and MLP-Net, were trained and validated. Finally, global performance of these models was assessed using the classification accuracy (ACC), Kappa statistics (KS), and the area under the curve (AUC). In addition, the Wilcoxon signed-rank test was employed to check the prediction performance of these models. The results show that the three models performed well; however, the MLP-Net model has the highest prediction performance (ACC = 81.7, KS = 0.633, and AUC = 0.894), followed by the RF model (ACC = 81.1, KS = 0.621, and AUC = 0.883), and the SVMC model (ACC = 80.2, KS = 0.604, and AUC = 0.867). The results of this study are useful for local authorities and forest managers in forest management and fire suppression.
Published: 2018

5. A tree-based intelligence ensemble approach for spatial prediction of potential groundwater.

Author: Avand, Mohammadtaghi, Janizadeh, Saeid, Tien Bui, Dieu, Pham, Viet Hoa, Ngo, Phuong Thao T., and Nhu, Viet-Ha
Subjects: GROUNDWATER, RANDOM forest algorithms, FORECASTING, MACHINE learning, LAND use
Abstract: The objective of this research is to propose and confirm a new machine learning approach of Best-First tree (BFTree), AdaBoost (AB), MultiBoosting (MB), and Bagging (Bag) ensembles for potential groundwater mapping and assessing the role of influencing factors. The Yasuj-Dena area (Iran) is selected as a case study. In this regard, a Yasuj-Dena database was established with 362 spring locations and 12 groundwater-influencing factors (slope, aspect, elevation, stream power index (SPI), length of slope (LS), topographic wetness index (TWI), topographic position index (TPI), land use, lithology, distance from fault, distance from river, and rainfall). The database was employed to train and validate the proposed groundwater models. The area under the curve (AUC) and statistical metrics were employed to check and confirm the quality of the models. The result shows that the BFTree-Bag model (AUC = 0.810, kappa = 0.495) has the highest prediction performance, followed by the BFTree-MB model (AUC = 0.785, kappa = 0.477), and the BFTree-MB model (AUC = 0.745, kappa = 0.422). Compared to the benchmark of Random Forests, the BFTree-Bag model performs better; therefore, we conclude that the BFTree-Bag is a new tool that should be used for modeling of groundwater potential. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

6. A novel hybrid approach of Bayesian Logistic Regression and its ensembles for landslide susceptibility assessment.

Author: Abedini, Mousa, Ghasemian, Bahareh, Shirzadi, Ataollah, Shahabi, Himan, Chapi, Kamran, Pham, Binh Thai, Bin Ahmad, Baharin, and Tien Bui, Dieu
Subjects: LANDSLIDES, LOGISTIC regression analysis, RECEIVER operating characteristic curves, ARTIFICIAL intelligence
Abstract: A novel artificial intelligence approach of Bayesian Logistic Regression (BLR) and its ensembles [Random Subspace (RS), Adaboost (AB), Multiboost (MB) and Bagging] was introduced for landslide susceptibility mapping in a part of Kamyaran city in Kurdistan Province, Iran. A spatial database was generated which includes a total of 60 landslide locations and a set of conditioning factors tested by the Information Gain Ratio technique. Performance of these models was evaluated using the area under the ROC curve (AUROC) and statistical index-based methods. Results showed that the hybrid ensemble models could significantly improve the performance of the base classifier of BLR (AUROC = 0.930). However, RS model (AUROC = 0.975) had the highest performance in comparison to other landslide ensemble models, followed by Bagging (AUROC = 0.972), MB (AUROC = 0.970) and AB (AUROC = 0.957) models, respectively. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

7. Spatial prediction of flood potential using new ensembles of bivariate statistics and artificial intelligence: A case study at the Putna river catchment of Romania.

Author: Costache, Romulus and Tien Bui, Dieu
Abstract: Flash-flood is considered to be one of the most destructive natural hazards in the world, which is difficult to accurately model and predict. The objective of the present research is to propose new ensembles of bivariate statistics and artificial intelligences and to introduce a comprehensive methodology for predicting flood susceptibility. The Putna river catchment of Romania is selected as a case study. In this regard, a total of six ensemble models were proposed and verified: Multilayer Perceptron neural network-Frequency Ratio (MLP-FR), Multilayer Perceptron neural network -Weights of Evidence (MLP-WOE), Rotation Forest-Frequency Ratio (RF-FR), Rotation Forest-Weights of Evidence (RF-WOE), Classification and Regression Tree-Frequency Ratio (CART-FR), and Classification and Regression Tree-Weights of Evidence (CART-WOE). In a first step, a geospatial database was created for the study area. This database includes 132 flood locations and 14 conditioning factors (lithology, slope angle, plan curvature, hydrological soil group, topographic wetness index, landuse, convergence index, elevation, distance from river, profile curvature, rainfall, aspect, stream power index, and topographic position index). In the next step, the Information Gain Ratio was used to evaluate the predictive ability of these factors. Subsequently, the database was used to train and validate the six ensemble models. The Receiver operating characteristic (ROC) curve, area under the curve (AUC), and statistical measures were used to evaluate the performance of the models. The results show that the prediction capability of the proposed ensemble models varied from 86.8% (the RF-FR model) to 93.9% (the RF-WOE model). These values indicate a high prediction performance for all the models. Therefore, we can state that the proposed ensemble models are new reliable tools which can be used for flood susceptibility modelling. 
Highlights: • Six new artificial intelligence ensemble models were proposed for flood modelling. • 14 conditioning factors were considered as predictors for flood potential. • All ensemble models have high prediction performance. • MLP-WOE and RF-FR have the best prediction performance (>91%). • Slope is the most important factor for flood occurrence. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

8. Adaptive Network Based Fuzzy Inference System with Meta-Heuristic Optimizations for International Roughness Index Prediction.

Author: Nguyen, Hoang-Long, Pham, Binh Thai, Son, Le Hoang, Thang, Nguyen Trung, Ly, Hai-Bang, Le, Tien-Thinh, Ho, Lanh Si, Le, Thanh-Hai, and Tien Bui, Dieu
Subjects: ADAPTIVE fuzzy control, MATHEMATICAL optimization, FUZZY systems, PARTICLE swarm optimization, STANDARD deviations, ARTIFICIAL neural networks
Abstract: The International Roughness Index (IRI) is one of the most important indexes used to quantify road surface roughness. In this paper, we propose a new hybrid approach between adaptive network based fuzzy inference system (ANFIS) and various meta-heuristic optimizations such as the genetic algorithm (GA), particle swarm optimization (PSO), and the firefly algorithm (FA) to develop several hybrid models, namely GA based ANFIS (GANFIS), PSO based ANFIS (PSOANFIS), and FA based ANFIS (FAANFIS), respectively, for the prediction of the IRI. A benchmark model named artificial neural networks (ANN) was also used for comparison with those hybrid models. To do this, a total of 2811 samples in the case study of the north of Vietnam (Northwest region, Northeast region, and the Red River Delta Area) within the scope of management of the DRM-I Department were used to validate the models in terms of various criteria like coefficient of determination (R) and the root mean square error (RMSE). Experimental results affirmed the potential and effectiveness of the proposed prediction models: the PSOANFIS (RMSE = 0.145 and R = 0.888) is better than the other models, namely GANFIS (RMSE = 0.155 and R = 0.872), FAANFIS (RMSE = 0.170 and R = 0.849), and ANN (RMSE = 0.186 and R = 0.804). The results of this study are helpful for accurate prediction of the IRI for evaluation of quality of road surface roughness. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

9. Machine-Learning-Based Classification Approaches toward Recognizing Slope Stability Failure.

Author: Moayedi, Hossein, Tien Bui, Dieu, Kalantar, Bahareh, and Kok Foong, Loke
Subjects: SLOPE stability, RADIAL basis functions, SUPPORT vector machines, SAFETY factor in engineering, MATHEMATICAL optimization, REGRESSION analysis, EARTH dams
Abstract: In this paper, the authors investigated the applicability of combining machine-learning-based models toward slope stability assessment. To do this, several well-known machine-learning-based methods, namely multiple linear regression (MLR), multi-layer perceptron (MLP), radial basis function regression (RBFR), improved support vector machine using sequential minimal optimization algorithm (SMO-SVM), lazy k-nearest neighbor (IBK), random forest (RF), and random tree (RT), were selected to evaluate the stability of a slope through estimating the factor of safety (FOS). In the following, a comparative classification was carried out based on the five stability categories. Based on the respective values of total scores (the summation of scores obtained for the training and testing stages) of 15, 35, 48, 15, 50, 60, and 57, acquired for MLR, MLP, RBFR, SMO-SVM, IBK, RF, and RT, respectively, it was concluded that RF outperformed other intelligent models. The results of statistical indexes also prove the excellent prediction from the optimized structure of the ANN and RF techniques. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

10. A novel ensemble modeling approach for the spatial prediction of tropical forest fire susceptibility using LogitBoost machine learning classifier and multi-source geospatial data.

Author: Tehrany, Mahyat Shafapour, Jones, Simon, Shabani, Farzin, Martínez-Álvarez, Francisco, and Tien Bui, Dieu
Subjects: FOREST fires, WILDFIRES, GEOSPATIAL data, TROPICAL forests, FOREST management, MACHINE learning, EMERGENCY management
Abstract: A reliable forest fire susceptibility map is a necessity for disaster management and a primary reference source in land use planning. We set out to evaluate the use of the LogitBoost ensemble-based decision tree (LEDT) machine learning method for forest fire susceptibility mapping through a comparative case study at the Lao Cai region of Vietnam. A thorough literature search would indicate the method has not previously been applied to forest fires. Support vector machine (SVM), random forest (RF), and Kernel logistic regression (KLR) were used as benchmarks in the comparative evaluation. A fire inventory database for the study area was constructed based on data of previous forest fire occurrences, and related conditioning factors were generated from a number of sources. Thereafter, forest fire probability indices were computed through each of the four modeling techniques, and performances were compared using the area under the curve (AUC), Kappa index, overall accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV). The LEDT model produced the best performance, both on the training and on validation datasets, demonstrating a 92% prediction capability. Its overall superiority over the benchmarking models suggests that it has the potential to be used as an efficient new tool for forest fire susceptibility mapping. Fire prevention is a critical concern for local forestry authorities in tropical Lao Cai region, and based on the evidence of our study, the method has a potential application in forestry conservation management. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

11. A swarm intelligence-based machine learning approach for predicting soil shear strength for road construction: a case study at Trung Luong National Expressway Project (Vietnam).

Author: Tien Bui, Dieu, Hoang, Nhat-Duc, and Nhu, Viet-Ha
Subjects: SWARM intelligence, SHEAR strength of soils, MACHINE learning, PARAMETER estimation, SUPPORT vector machines
Abstract: Determining the shear strength of soil is an important task in the design phase of a construction project. This study puts forward an artificial intelligence (AI) solution to estimate this parameter of soil. The proposed approach is a hybrid AI model that integrates the least squares support vector machine (LSSVM) and the cuckoo search optimization (CSO). A dataset of 332 soil samples collected from the Trung Luong National Expressway Project in Viet Nam has been used for constructing and validating the AI model. The sample depth, sand percentage, loam percentage, clay percentage, moisture content, wet density of soil, specific gravity, liquid limit, plastic limit, plastic index, and liquid index are used as input variables to predict the output variable of shear strength. In the hybrid AI framework, LSSVM is employed to generalize the functional mapping that estimates the shear strength from the information provided by the aforementioned input variables. Since the model establishment of LSSVM requires a proper setting of the regularization and the kernel function parameters, the CSO algorithm is utilized to automatically determine these parameters. Experimental results show that the prediction accuracy of the hybrid method of LSSVM and CSO (RMSE = 0.082, MAPE = 14.841, and R2 = 0.885) is better than those of the benchmark approaches including the standard LSSVM, the artificial neural network, and the regression tree. Therefore, the proposed method is a promising alternative for assisting construction engineers in the task of soil shear strength estimation. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

12. Spatial pattern analysis and prediction of forest fire using new machine learning approach of Multivariate Adaptive Regression Splines and Differential Flower Pollination optimization: A case study at Lao Cai province (Viet Nam).

Author: Tien Bui, Dieu, Hoang, Nhat-Duc, and Samui, Pijush
Subjects: *FOREST fires, *GEOGRAPHIC spatial analysis, *MACHINE learning, *RADIAL basis functions, *WILCOXON signed-rank test, *POLLINATION
Abstract: Understanding spatial patterns of forest fire is of key importance for fire danger management and ecological implication. The aim of this study was to propose a new machine learning methodology for analyzing and predicting spatial patterns of forest fire danger with a case study of tropical forest fire at Lao Cai province (Vietnam). For this purpose, a Geographical Information System (GIS) database for the study area was established, including ten influencing factors (slope, aspect, elevation, land use, distance to road, normalized difference vegetation index, rainfall, temperature, wind speed, and humidity) and 257 fire locations. The relevance level of these factors with the forest fire was analyzed and assessed using the Mutual Information algorithm. Then, a new hybrid artificial intelligence model named MARS-DFP, which is Multivariate Adaptive Regression Splines (MARS) optimized by Differential Flower Pollination (DFP), was proposed and used to construct a forest fire model for generating spatial patterns of forest fire. MARS is employed to build the forest fire model for generalizing a classification boundary that distinguishes fire and non-fire areas, whereas DFP, a metaheuristic approach, was utilized to optimize the model. Finally, global prediction performance of the model was assessed using Area Under the curve (AUC), Classification Accuracy Rate (CAR), Wilcoxon signed-rank test, and various statistical indices. The result demonstrated that the predictive performance of the MARS-DFP model was high (AUC = 0.91 and CAR = 86.57%) and better than those of other benchmark methods: Backpropagation Artificial Neural Network, Adaptive neuro fuzzy inference system, and Radial Basis Function Neural Network. This fact confirms that the newly constructed MARS-DFP model is a promising alternative for spatial prediction of forest fire susceptibility.
Highlights: • MARS-DFP is proposed for analyzing and predicting spatial patterns of forest fire risk. • MARS-DFP has high performance, providing >90% prediction accuracy for future forest fire. • MARS-DFP outperforms benchmark models, i.e. ANN and ANFIS. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

13. GIS-based spatial prediction of tropical forest fire danger using a new hybrid machine learning method.

Author: Tien Bui, Dieu, Le, Hung Van, and Hoang, Nhat-Duc
Subjects: FOREST fires, GEOGRAPHIC information systems, MACHINE learning, ARTIFICIAL neural networks, METAHEURISTIC algorithms, COMBINATORIAL optimization, SEARCH algorithms
Abstract: A forest fire danger map at regional scale is considered of utmost importance for local authorities to efficiently allocate resources to fire prevention measures and establish appropriate land use plans. This study aims to introduce a new machine learning method, named DFP-MnBpAnn, based on Artificial Neural Network (Ann) with a novel hybrid training algorithm of Differential Flower Pollination (DFP) and mini-batch backpropagation (MnBp) for spatial modeling of forest fire danger. Tropical forest of the Lam Dong province (Vietnam) was used as a case study. To achieve this task, a Geographical Information System (GIS) database of the forest fire for the study area was established. Accordingly, DFP, as a metaheuristic method, is used to optimize the weights and structure of Ann to fit the GIS database at hand. MnBp, in turn, is employed periodically during the DFP-based optimization process, acting as a local search that aims to improve both the quality of the found solutions and the convergence rate. Experimental outcomes demonstrate that the proposed DFP-MnBpAnn model is superior to other benchmark methods with satisfactory prediction accuracy (Classification Accuracy Rate = 88.43%). This fact confirms that DFP-MnBpAnn is a promising alternative for the problem of large-scale forest fire danger mapping. Highlights: • DFP-MnBpAnn is proposed for forest fire modeling. • DFP-MnBpAnn has high performance on the training and validation datasets. • DFP-MnBpAnn outperforms benchmark models, i.e. PSO-NF, BpANN, SVM, LSSVM, and RF. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

14. Prediction of soil compression coefficient for urban housing project using novel integration machine learning approach of swarm intelligence and Multi-layer Perceptron Neural Network.

Author: Tien Bui, Dieu, Nhu, Viet-Ha, and Hoang, Nhat-Duc
Subjects: *SOIL compaction, *HOUSING, *MACHINE learning, *SWARM intelligence, *ARTIFICIAL neural networks
Abstract: In many engineering projects, the soil compression coefficient is an important parameter used for estimating the settlement of soil layers. The common practice of determining the soil compression coefficient via the oedometer test is time-consuming and expensive. This study proposes a machine learning solution to replace the conventional tests used for obtaining the coefficient of soil compression. The new approach is an integration of the Multi-Layer Perceptron Neural Network (MLP Neural Nets) and Particle Swarm Optimization (PSO). These two computational intelligence methods work synergistically to establish a prediction model of soil compression coefficient. The PSO metaheuristic is employed to optimize the MLP Neural Nets model structure. To train and validate the proposed method, named PSO-MLP Neural Nets, a dataset of 154 soil samples featuring 12 influencing factors has been collected from the geotechnical investigation process of a high-rise building project. Experimental results show that the proposed PSO-MLP Neural Nets attained the most accurate prediction of the soil compression coefficient, with RMSE = 0.0267, MAE = 0.0145, and R2 = 0.884. The result of the proposed model is significantly better than those obtained from other benchmark methods including the backpropagation neural network, the radial basis function neural network, the support vector regression, the random forest, and the Gaussian process. Based on the experimental results, the newly constructed PSO-MLP Neural Nets has strong potential as a new alternative to assist geotechnical engineers in the design phase of civil engineering projects. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

15. Enhancing Prediction Performance of Landslide Susceptibility Model Using Hybrid Machine Learning Approach of Bagging Ensemble and Logistic Model Tree.

Author: Truong, Xuan Luan, Mitamura, Muneki, Kono, Yasuyuki, Raghavan, Venkatesh, Yonezawa, Go, Truong, Xuan Quang, Do, Thi Hang, Tien Bui, Dieu, and Lee, Saro
Subjects: LANDSLIDES, MACHINE learning
Abstract: The objective of this research is to introduce a new machine learning ensemble approach that is a hybridization of Bagging ensemble (BE) and Logistic Model Trees (LMTree), named BE-LMTree, for improving the performance of the landslide susceptibility model. The LMTree is a relatively new machine learning algorithm that has rarely been explored for landslide study, whereas BE is an ensemble framework that has proven highly efficient for landslide modeling. The Upper Reaches Area of Red River Basin (URRB) in the Northwest region of Viet Nam was employed as a case study. For this work, a GIS database for the URRB area has been established, which contains a total of 255 landslide polygons and eight predisposing factors, i.e., slope, aspect, elevation, land cover, soil type, lithology, distance to fault, and distance to river. The database was then used to construct and validate the proposed BE-LMTree model. Quality of the final BE-LMTree model was checked using a confusion matrix and a set of statistical measures. The result showed that the performance of the proposed BE-LMTree model is high, with a classification accuracy of 93.81% on the training dataset and a prediction capability of 83.4% on the validation dataset. When compared to the support vector machine model and the LMTree model, the proposed BE-LMTree model performs better; therefore, we concluded that the BE-LMTree could prove to be a new efficient tool that should be used for landslide modeling. This research could provide useful results for landslide modeling in landslide prone areas. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

16. Landslide susceptibility modelling using different advanced decision trees methods.

Author: Thai Pham, Binh, Tien Bui, Dieu, and Prakash, Indra
Subjects: *LANDSLIDES, *SENSITIVITY (Personality trait), *DECISION trees, *LOGISTIC model (Demography), *CURVATURE
Abstract: In this paper, decision tree machine learning algorithms, namely Random Forest (RF), Alternating Decision Tree (ADT), and Logistic Model Tree (LMT), were applied for modelling of susceptibility of landslides at the Luc Yen district, Northern Vietnam. These methods were evaluated to compare the performance of models and for selection of the best model for landslide susceptibility mapping and prediction. In this study, data of 95 landslide events were analysed with 10 landslide affecting factors using the Correlation-Based Feature Selection (CFS). These factors are land use, elevation, slope, distance to roads, aspect, curvature, distance to faults, rainfall, lithology, and distance to rivers. Receiver Operating Characteristic (ROC) curve, statistical indices (sensitivity, specificity, and kappa), and Chi-square test were utilised for validating and comparing the models' performance. The modelling results show that the performance of the RF model (AUC = 0.839) is the best with the data at hand compared to the ADT model (0.827) and the LMT model (0.809). The RF should be applied for better landslide susceptibility mapping and management. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

17. Improving Accuracy Estimation of Forest Aboveground Biomass Based on Incorporation of ALOS-2 PALSAR-2 and Sentinel-2A Imagery and Machine Learning: A Case Study of the Hyrcanian Forest Area (Iran).

Author: Vafaei, Sasan, Soosani, Javad, Adeli, Kamran, Fadaei, Hadi, Naghavi, Hamed, Pham, Tien Dat, and Tien Bui, Dieu
Subjects: FORESTS & forestry, BIOMASS, MACHINE learning, MACHINE theory, DATA mining, MEAN square algorithms, LEAST squares
Abstract: The main objective of this research is to investigate the potential combination of Sentinel-2A and ALOS-2 PALSAR-2 (Advanced Land Observing Satellite -2 Phased Array type L-band Synthetic Aperture Radar-2) imagery for improving the accuracy of the Aboveground Biomass (AGB) measurement. According to the current literature, this kind of investigation has rarely been conducted. The Hyrcanian forest area (Iran) is selected as the case study. For this purpose, a total of 149 sample plots for the study area were documented through fieldwork. Using the imagery, three datasets were generated including the Sentinel-2A dataset, the ALOS-2 PALSAR-2 dataset, and the combination of the Sentinel-2A dataset and the ALOS-2 PALSAR-2 dataset (Sentinel-ALOS). Because the accuracy of the AGB estimation is dependent on the method used, in this research, four machine learning techniques were selected and compared, namely Random Forests (RF), Support Vector Regression (SVR), Multi-Layer Perceptron Neural Networks (MPL Neural Nets), and Gaussian Processes (GP). The performance of these AGB models was assessed using the coefficient of determination (R²), the root-mean-square error (RMSE), and the mean absolute error (MAE). The results showed that the AGB models derived from the combination of the Sentinel-2A and the ALOS-2 PALSAR-2 data had the highest accuracy, followed by models using the Sentinel-2A dataset and the ALOS-2 PALSAR-2 dataset. Among the four machine learning models, the SVR model (R² = 0.73, RMSE = 38.68, and MAE = 32.28) had the highest prediction accuracy, followed by the GP model (R² = 0.69, RMSE = 40.11, and MAE = 33.69), the RF model (R² = 0.62, RMSE = 43.13, and MAE = 35.83), and the MPL Neural Nets model (R² = 0.44, RMSE = 64.33, and MAE = 53.74). Overall, the Sentinel-2A imagery provides a reasonable result while the ALOS-2 PALSAR-2 imagery provides a poor result of the forest AGB estimation. 
The combination of the Sentinel-2A imagery and the ALOS-2 PALSAR-2 imagery improved the estimation accuracy of AGB compared to that of the Sentinel-2A imagery only. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

18. Landslide Susceptibility Assessment Using Bagging Ensemble Based Alternating Decision Trees, Logistic Regression and J48 Decision Trees Methods: A Comparative Study.

Author: Pham, Binh, Tien Bui, Dieu, and Prakash, Indra
Subjects: LANDSLIDES, BOOTSTRAP aggregation (Algorithms), LOGISTIC regression analysis, MACHINE learning
Abstract: In this study, we have evaluated and compared the prediction capability of Bagging Ensemble Based Alternating Decision Trees (BADT), Logistic Regression (LR), and J48 Decision Trees (J48DT) for landslide susceptibility mapping at part of the Uttarakhand State (India). The BADT method, proposed in the present study, is a novel hybrid machine learning ensemble approach of bagging ensemble and alternating decision trees. The J48DT is a relatively new machine learning technique which has been applied in only a few landslide studies, and the LR is known as a popular landslide susceptibility model. For the model studies, a spatial database of 930 historical landslide events and 15 landslide affecting factors has been collected and analyzed. This database has been used to build and validate the landslide models, namely BADT, LR and J48DT. Predictive capability of these models has been validated and compared using statistical analysis methods and the Receiver Operating Characteristic (ROC) curve. Results show that these three landslide models (BADT, LR and J48DT) performed well with the training dataset. However, using the validation dataset the BADT model has the highest prediction capability, followed by the LR model, and the J48DT model, respectively. This indicates that the BADT is a promising method which can also be used for landslide susceptibility assessment in other landslide prone areas. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

19. Hybrid integration of Multilayer Perceptron Neural Networks and machine learning ensembles for landslide susceptibility assessment at Himalayan area (India) using GIS.

Author: Pham, Binh Thai, Tien Bui, Dieu, Prakash, Indra, and Dholakia, M.B.
Subjects: *MULTILAYER perceptrons, *ARTIFICIAL neural networks, *MACHINE learning, *LANDSLIDES, *GEOGRAPHIC information systems
Abstract: The main objective of this study is to evaluate and compare the performance of landslide models using machine learning ensemble techniques for landslide susceptibility assessment. These techniques combine ensemble methods (AdaBoost, Bagging, Dagging, MultiBoost, Rotation Forest, and Random SubSpace) with the base classifier of Multilayer Perceptron Neural Networks (MLP Neural Nets). Ensemble techniques have been widely applied in other fields; however, their application is still rare in the assessment of landslide problems. Meanwhile, MLP Neural Nets, a type of artificial neural network, have been applied widely and efficiently in landslide problems. In the present study, landslide models of part of the Himalayan area (India) have been constructed and validated. For the evaluation and comparison of these models, the receiver operating characteristic curve and Chi Square test methods have been applied. Overall, all landslide models performed well in landslide susceptibility assessment, but the performance of the MultiBoost model is the highest (AUC = 0.886), followed by the Dagging model (AUC = 0.885), the Rotation Forest model (AUC = 0.882), the Bagging and Random SubSpace models (AUC = 0.881), and the AdaBoost model (AUC = 0.876), respectively. Moreover, machine learning ensemble models have significantly improved the performance of the base classifier of MLP Neural Nets (AUC = 0.874). Analysis of results indicates that landslide models using machine learning ensemble frameworks are promising methods which can be used as alternatives to individual base classifiers for landslide susceptibility assessment of other prone areas. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

20. Novel Machine Learning Approaches for Modelling the Gully Erosion Susceptibility.

Author: Arabameri, Alireza, Asadi Nalivan, Omid, Chandra Pal, Subodh, Chakrabortty, Rabin, Saha, Asish, Lee, Saro, Pradhan, Biswajeet, and Tien Bui, Dieu
Subjects: MACHINE learning, EROSION, RECEIVER operating characteristic curves, SUPPORT vector machines, WATER conservation
Abstract: The extreme form of land degradation caused by the formation of gullies is a major challenge for the sustainability of land resources. This problem is more vulnerable in the arid and semi-arid environment and associated damage to agriculture and allied economic activities. Appropriate modeling of such erosion is therefore needed with optimum accuracy for estimating vulnerable regions and taking appropriate initiatives. The Golestan Dam has faced an acute problem of gully erosion over the last decade and has adversely affected society. Here, the artificial neural network (ANN), general linear model (GLM), maximum entropy (MaxEnt), and support vector machine (SVM) machine learning algorithm with 90/10, 80/20, 70/30, 60/40, and 50/50 random partitioning of training and validation samples was selected purposively for estimating the gully erosion susceptibility. The main objective of this work was to predict the susceptible zone with the maximum possible accuracy. For this purpose, random partitioning approaches were implemented. For this purpose, 20 gully erosion conditioning factors were considered for predicting the susceptible areas by considering the multi-collinearity test. The variance inflation factor (VIF) and tolerance (TOL) limit were considered for multi-collinearity assessment for reducing the error of the models and increase the efficiency of the outcome. The ANN with 50/50 random partitioning of the sample is the most optimal model in this analysis. The area under curve (AUC) values of receiver operating characteristics (ROC) in ANN (50/50) for the training and validation data are 0.918 and 0.868, respectively. The importance of the causative factors was estimated with the help of the Jackknife test, which reveals that the most important factor is the topography position index (TPI). 
Apart from this, the prioritization of all predicted models was estimated taking into account the training and validation data set, which should help future researchers to select models from this perspective. This type of outcome should help planners and local stakeholders to implement appropriate land and water conservation measures. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

21. A New Modeling Approach for Spatial Prediction of Flash Flood with Biogeography Optimized CHAID Tree Ensemble and Remote Sensing Data.

Author: Nguyen, Viet-Nghia, Yariyan, Peyman, Amiri, Mahdis, Dang Tran, An, Pham, Tien Dat, Do, Minh Phuong, Thi Ngo, Phuong Thao, Nhu, Viet-Ha, Quoc Long, Nguyen, and Tien Bui, Dieu
Subjects: GEOSPATIAL data, FORECASTING, REMOTE sensing, SYNTHETIC aperture radar, FLOODS, BIOGEOGRAPHY
Abstract: Flash floods induced by torrential rainfalls are considered one of the most dangerous natural hazards, due to their sudden occurrence and high magnitudes, which may cause huge damage to people and properties. This study proposed a novel modeling approach for spatial prediction of flash floods based on the tree intelligence-based CHAID (Chi-square Automatic Interaction Detector) random subspace, optimized by biogeography-based optimization (the CHAID-RS-BBO model), using remote sensing and geospatial data. In this proposed approach, a forest of tree intelligence was constructed through the random subspace ensemble, and, then, the swarm intelligence was employed to train and optimize the model. The Luc Yen district, located in the northwest mountainous area of Vietnam, was selected as a case study. For this case, a flood inventory map with 1866 polygons for the district was prepared based on Sentinel-1 synthetic aperture radar (SAR) imagery and field surveys with handheld GPS. Then, a geospatial database with ten influencing variables (land use/land cover, soil type, lithology, river density, rainfall, topographic wetness index, elevation, slope, curvature, and aspect) was prepared. Using the inventory map and the ten explanatory variables, the CHAID-RS-BBO model was trained and verified. Various statistical metrics were used to assess the prediction capability of the proposed model. The results show that the proposed CHAID-RS-BBO model yielded the highest predictive performance, with an overall accuracy of 90% in predicting flash floods, and outperformed benchmarks (i.e., the CHAID, the J48-DT, the logistic regression, and the multilayer perceptron neural network (MLP-NN) models). We conclude that the proposed method can accurately predict the spatial distribution of flash floods in tropical storm areas. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

22. Landslide Susceptibility Evaluation and Management Using Different Machine Learning Methods in The Gallicash River Watershed, Iran.

Author: Arabameri, Alireza, Saha, Sunil, Roy, Jagabandhu, Chen, Wei, Blaschke, Thomas, and Tien Bui, Dieu
Subjects: LANDSLIDE hazard analysis, FISHER discriminant analysis, MACHINE learning, LANDSLIDES, GEOGRAPHIC information systems, RECEIVER operating characteristic curves
Abstract: This analysis aims to generate landslide susceptibility maps (LSMs) using various machine learning methods, namely random forest (RF), alternating decision tree (ADTree) and Fisher's Linear Discriminant Function (FLDA). The results of the FLDA, RF and ADTree models were compared with regard to their applicability for creating an LSM of the Gallicash river watershed in the northern part of Iran close to the Caspian Sea. A landslide inventory map was created using GPS points obtained in a field analysis, high-resolution satellite images, topographic maps and historical records. A total of 249 landslide sites have been identified to date and were used in this study to model and validate the LSMs of the study region. Of the 249 landslide locations, 70% were used as training data and 30% for the validation of the resulting LSMs. Sixteen factors related to topographical, hydrological, soil type, geological and environmental conditions were used and a multi-collinearity test of the landslide conditioning factors (LCFs) was performed. Using the natural break method (NBM) in a geographic information system (GIS), the LSMs generated by the RF, FLDA, and ADTree models were categorized into five classes, namely very low, low, medium, high and very high landslide susceptibility (LS) zones. The very high susceptibility zones cover 15.37% (ADTree), 16.10% (FLDA) and 11.36% (RF) of the total catchment area. The results of the different models (FLDA, RF, and ADTree) were explained and compared using the area under receiver operating characteristics (AUROC) curve, seed cell area index (SCAI), efficiency and true skill statistic (TSS). The accuracy of models was calculated considering both the training and validation data. The results revealed that the AUROC success rates are 0.89 (ADTree), 0.92 (FLDA) and 0.97 (RF) and prediction rates are 0.82 (ADTree), 0.79 (FLDA) and 0.98 (RF), which justifies the approach and indicates a reasonably good landslide prediction.
The results of the SCAI, efficiency and TSS methods showed that all models have an excellent modeling capability. In a comparison of the models, the RF model outperforms the boosted regression tree (BRT) and ADTree models. The results of the landslide susceptibility modeling could be useful for land-use planning and decision-makers, for managing and controlling the current and future landslides, as well as for the protection of society and the ecosystem. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

23. Advanced Machine Learning and Big Data Analytics in Remote Sensing for Natural Hazards Management.

Author: Martínez-Álvarez, Francisco and Tien Bui, Dieu
Subjects: *REMOTE sensing, *BIG data, *MACHINE learning, *SOIL salinity, *WILDFIRES
Abstract: This editorial summarizes the performance of the special issue entitled Advanced Machine Learning and Big Data Analytics in Remote Sensing for Natural Hazards Management, which was published in MDPI's Remote Sensing journal. The special issue ran in 2018 and 2019 and accepted a total of nine papers from authors from thirteen different countries. So far, these papers have received 116 citations. Earthquakes, landslides, floods, wildfire and soil salinity were the topics analyzed. New methods were introduced, with applications of the utmost relevance. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

24. Hybrid Computational Intelligence Models for Improvement Gully Erosion Assessment.

Author: Arabameri, Alireza, Chen, Wei, Lombardo, Luigi, Blaschke, Thomas, and Tien Bui, Dieu
Subjects: FISHER discriminant analysis, COMPUTATIONAL intelligence, RECEIVER operating characteristic curves, EROSION, SOIL erosion
Abstract: Gullying is a type of soil erosion that currently represents a major threat at the societal scale and will likely increase in the future. In Iran, soil erosion, and specifically gullying, is already causing significant distress to local economies by affecting agricultural productivity and infrastructure. Recognizing this threat has recently led the Iranian geomorphology community to focus on the problem across the whole country. This study is in line with other efforts where the optimal method to map gully-prone areas is sought by testing state-of-the-art machine learning tools. In this study, we compare the performance of three machine learning algorithms, namely Fisher's linear discriminant analysis (FLDA), logistic model tree (LMT) and naïve Bayes tree (NBTree). We also introduce three novel ensemble models by combining the aforementioned base classifiers to the Random SubSpace (RS) meta-classifier namely RS-FLDA, RS-LMT and RS-NBTree. The area under the receiver operating characteristic (AUROC), true skill statistics (TSS) and kappa criteria are used for calibration (goodness-of-fit) and validation (prediction accuracy) datasets to compare the performance of the different algorithms. In addition to susceptibility mapping, we also study the association between gully erosion and a set of morphometric, hydrologic and thematic properties by adopting the evidential belief function (EBF). The results indicate that hydrology-related factors contribute the most to gully formation, which is also confirmed by the susceptibility patterns displayed by the RS-NBTree ensemble. The RS-NBTree is the model that outperforms the other five models, as indicated by the prediction accuracy (area under curve (AUC) = 0.898, Kappa = 0.748 and TSS = 0.697), and goodness-of-fit (AUC = 0.780, Kappa = 0.682 and TSS = 0.618). The analyses are performed with the same gully presence/absence balanced modeling design. 
Therefore, the differences in performance are dependent on the algorithm architecture. Overall, the EBF model can detect strong and reasonable dependencies towards gully-prone conditions. The RS-NBTree ensemble model performed significantly better than the others, suggesting greater flexibility towards unknown data, which may support the applications of these methods in transferable susceptibility models in areas that are potentially erodible but currently lack gully data. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF
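Note: several abstracts in this list, including the one above, report sensitivity, specificity, the true skill statistic (TSS) and kappa. The following is a minimal sketch of how these quantities are commonly derived from a 2x2 confusion matrix, using the standard definitions TSS = sensitivity + specificity - 1 and Cohen's kappa. The function name `binary_metrics` and the example counts are hypothetical and are not taken from any of the listed papers.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return common presence/absence metrics for a 2x2 confusion matrix.

    tp, fn, fp, tn are counts of true positives, false negatives,
    false positives and true negatives (illustrative inputs only).
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    tss = sensitivity + specificity - 1   # true skill statistic
    observed = (tp + tn) / n              # observed agreement (accuracy)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": tss,
        "accuracy": observed,
        "kappa": kappa,
    }


# Hypothetical counts for a balanced presence/absence validation set.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

TSS and kappa both correct for agreement that a trivial classifier could reach, which is presumably why they are reported alongside plain accuracy and AUC in these studies.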

25. A Comparative Study of Kernel Logistic Regression, Radial Basis Function Classifier, Multinomial Naïve Bayes, and Logistic Model Tree for Flash Flood Susceptibility Mapping.

Author: Pham, Binh Thai, Phong, Tran Van, Nguyen, Huu Duy, Qi, Chongchong, Al-Ansari, Nadhir, Amini, Ata, Ho, Lanh Si, Tuyen, Tran Thi, Yen, Hoang Phan Hai, Ly, Hai-Bang, Prakash, Indra, and Tien Bui, Dieu
Subjects: RADIAL basis functions, LANDSLIDE hazard analysis, LOGISTIC regression analysis, FLOOD risk, RECEIVER operating characteristic curves, FLOODS, COMPARATIVE studies
Abstract: Risk of flash floods is currently an important problem in many parts of Vietnam. In this study, we used four machine-learning methods, namely Kernel Logistic Regression (KLR), Radial Basis Function Classifier (RBFC), Multinomial Naïve Bayes (NBM), and Logistic Model Tree (LMT), to generate flash flood susceptibility maps for a small part of Nghe An province in the Center region (Vietnam), where recurrent flood problems are being experienced. Performance of these four methods was evaluated to select the best method for flash flood susceptibility mapping. In the model studies, ten flash flood conditioning factors, namely soil, slope, curvature, river density, flow direction, distance from rivers, elevation, aspect, land use, and geology, were chosen based on topography and geo-environmental conditions of the site. For the validation of models, the area under the Receiver Operating Characteristic (ROC) curve (AUC) and various statistical indices were used. The results indicated that performance of all the models is good for generating flash flood susceptibility maps (AUC = 0.983–0.988). However, performance of the LMT model is the best among the four methods (LMT: AUC = 0.988; KLR: AUC = 0.985; RBFC: AUC = 0.984; and NBM: AUC = 0.983). The present study would be useful for the construction of accurate flash flood susceptibility maps with the objective of identifying flood-susceptible areas/zones for proper flash flood risk management. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

26. Gully Head-Cut Distribution Modeling Using Machine Learning Methods—A Case Study of N.W. Iran.

Author: Arabameri, Alireza, Chen, Wei, Blaschke, Thomas, Tiefenbacher, John P., Pradhan, Biswajeet, and Tien Bui, Dieu
Subjects: PRECIPITATION forecasting, DECISION trees, ARID regions, MACHINE learning, LOGISTIC regression analysis, SOIL erosion, WATERSHED management
Abstract: To more effectively prevent and manage the scourge of gully erosion in arid and semi-arid regions, we present a novel-ensemble intelligence approach—bagging-based alternating decision-tree classifier (bagging-ADTree)—and use it to model a landscape's susceptibility to gully erosion based on 18 gully-erosion conditioning factors. The model's goodness-of-fit and prediction performance are compared to three other machine learning algorithms (single alternating decision tree, rotational-forest-based alternating decision tree (RF-ADTree), and benchmark logistic regression). To achieve this, a gully-erosion inventory was created for the study area, the Chah Mousi watershed, Iran by combining archival records containing reports of gully erosion, remotely sensed data from Google Earth, and geolocated sites of gully head-cuts gathered in a field survey. A total of 119 gully head-cuts were identified and mapped. To train the models' analysis and prediction capabilities, 83 head-cuts (70% of the total) and the corresponding measures of the conditioning factors were input into each model. The results from the models were validated using the data pertaining to the remaining 36 gully locations (30%). Next, the frequency ratio is used to identify which conditioning-factor classes have the strongest correlation with gully erosion. Using random-forest modeling, the relative importance of each of the conditioning factors was determined. Based on the random-forest results, the top eight factors in this study area are distance-to-road, drainage density, distance-to-stream, LU/LC, annual precipitation, topographic wetness index, NDVI, and elevation. Finally, based on goodness-of-fit and AUROC of the success rate curve (SRC) and prediction rate curve (PRC), the results indicate that the bagging-ADTree ensemble model had the best performance, with SRC (0.964) and PRC (0.978). 
RF-ADTree (SRC = 0.952 and PRC = 0.971), ADTree (SRC = 0.926 and PRC = 0.965), and LR (SRC = 0.867 and PRC = 0.870) were the subsequent best performers. The results also indicate that bagging and RF, as meta-classifiers, improved the performance of the ADTree model as a base classifier. The bagging-ADTree model's results indicate that 24.28% of the study area is classified as having high and very high susceptibility to gully erosion. The new ensemble model accurately identified the areas that are susceptible to gully erosion based on the past patterns of formation, but it also provides highly accurate predictions of future gully development. The novel ensemble method introduced in this research is recommended for use to evaluate the patterns of gullying in arid and semi-arid environments and can effectively identify the most salient conditioning factors that promote the development and expansion of gullies in erosion-susceptible environments. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF
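Note: the abstract above describes a random 70/30 partition of 119 gully head-cuts into 83 training and 36 validation sites, a design shared by several other records in this list. A minimal sketch of such a partition follows. The function name `split_inventory`, the seed, and the use of integer IDs in place of real site records are assumptions for illustration, and the papers' actual sampling procedures may differ.

```python
import random


def split_inventory(sites, train_fraction=0.7, seed=0):
    """Randomly partition inventory sites into training and validation lists."""
    shuffled = list(sites)
    # A seeded generator keeps the partition reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 119 hypothetical site IDs standing in for the mapped head-cuts.
train, valid = split_inventory(range(119))
print(len(train), len(valid))  # 83 36
```

Repeating the split with different seeds is one common way to check how sensitive the reported accuracy is to the particular partition.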

27. Application of Probabilistic and Machine Learning Models for Groundwater Potentiality Mapping in Damghan Sedimentary Plain, Iran.

Author: Arabameri, Alireza, Roy, Jagabandhu, Saha, Sunil, Blaschke, Thomas, Ghorbanzadeh, Omid, and Tien Bui, Dieu
Subjects: GROUNDWATER, PROBABILISTIC databases, STANDARD deviations, MACHINE learning, GLOBAL Positioning System, RECEIVER operating characteristic curves
Abstract: Groundwater is one of the most important natural resources, as it regulates the earth's hydrological system. The Damghan sedimentary plain area, located in the region of a semi-arid climate of Iran, has very critical conditions of groundwater due to massive pressure on it and is in need of robust models for identifying the groundwater potential zones (GWPZ). The main goal of the current research is to prepare a groundwater potentiality map (GWPM) considering the probabilistic, machine learning, data mining, and multi-criteria decision analysis (MCDA) approaches. For this purpose, 80 wells collected from the Iranian groundwater resource department and field investigation with global positioning system (GPS), have been selected randomly and considered as the groundwater inventory datasets. Out of 80 wells, 56 (70%) wells have been brought into play for modeling and 24 (30%) for validation purposes. Elevation, slope, aspect, convergence index (CI), rainfall, drainage density (Dd), distance to river, distance to fault, distance to road, lithology, soil type, land use/land cover (LU/LC), normalized difference vegetation index (NDVI), topographic wetness index (TWI), topographic position index (TPI), and stream power index (SPI) have been used for modeling purpose. The area under the receiver operating characteristic (AUROC), sensitivity (SE), specificity (SP), accuracy (AC), mean absolute error (MAE), and root mean square error (RMSE) are used for checking the goodness-of-fit and prediction accuracy of approaches to compare their performance. In addition, the influence of groundwater determining factors (GWDFs) on groundwater occurrence was evaluated by performing a sensitivity analysis model. 
The GWPMs produced by the technique for order preference by similarity to ideal solution (TOPSIS), random forest (RF), binary logistic regression (BLR), weight of evidence (WoE) and support vector machine (SVM) have been classified into four categories, i.e., low, medium, high and very high groundwater potentiality, with the help of the natural break classification method in the GIS environment. The very high groundwater potentiality class covers 15.09% for TOPSIS, 15.46% for WoE, 25.26% for RF, 15.47% for BLR, and 18.74% for SVM of the entire plain area. Based on sensitivity analysis, distance from river and drainage density have significant effects on groundwater occurrence. Validation results show that the BLR model, with the best prediction accuracy and goodness-of-fit, outperforms the other five models, although all models perform very well in modeling groundwater potential. Results of the seed cell area index model, used for checking the classification accuracy of the models, show that all models have suitable performance. Therefore, these are promising models that can be applied for GWPZ identification, which will help in taking necessary action in these areas. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

28. Spatial prediction of shallow landslide using Bat algorithm optimized machine learning approach: A case study in Lang Son Province, Vietnam.

Author: Tien Bui, Dieu, Hoang, Nhat-Duc, Nguyen, Hieu, and Tran, Xuan-Linh
Subjects: *LANDSLIDES, *LANDSLIDE prediction, *ARTIFICIAL neural networks, *GEOGRAPHIC information systems, *MACHINE learning, *GEODATABASES
Abstract: • Proposed a hybrid method for landslide susceptibility mapping. • LSSVC is employed for pattern recognition. • Bat Algorithm is used to optimize the model performance. • A GIS database in Lang Son province (Vietnam) is employed. • The hybrid model has a good prediction performance (CAR = 90.44%). This study develops a machine learning method that hybridizes the Least Squares Support Vector Classification (LSSVC) and Bat Algorithm (BA), named as BA-LSSVC, for spatial prediction of shallow landslide. To construct and verify the hybrid method, a Geographic Information System (GIS) database for the study area of Lang Son province (Vietnam) has been employed. LSSVC is used to separate data samples in the GIS database into two categories of non-landslide (negative class) and landslide (positive class). The BA metaheuristic is employed to assist the LSSVC model selection process by fine-tuning its hyper-parameters: the regularization coefficient and the kernel function parameter. Experimental results point out that the hybrid BA-LSSVC can help to achieve a desired prediction with an accuracy rate of more than 90%. The performance of BA-LSSVC is also better than those of benchmark methods, including the Convolutional Neural Network, Relevance Vector Machine, Artificial Neural Network, and Logistic Regression. Hence, the newly developed model is a capable tool to assist local authority in landslide hazard mitigation and management. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

29. The Feasibility of Three Prediction Techniques of the Artificial Neural Network, Adaptive Neuro-Fuzzy Inference System, and Hybrid Particle Swarm Optimization for Assessing the Safety Factor of Cohesive Slopes.

Author: Moayedi, Hossein, Tien Bui, Dieu, Gör, Mesut, Pradhan, Biswajeet, and Jaafari, Abolfazl
Subjects: *PARTICLE swarm optimization, *ARTIFICIAL neural networks, *SAFETY factor in engineering, *STANDARD deviations, *SLOPE stability, *FINITE element method
Abstract: In this paper, a neuro particle-based optimization of the artificial neural network (ANN) is investigated for slope stability calculation. The results are also compared to another artificial intelligence technique of a conventional ANN and adaptive neuro-fuzzy inference system (ANFIS) training solutions. The database used with 504 training datasets (e.g., a range of 80%) and testing dataset consists of 126 items (e.g., 20% of the whole dataset). Moreover, variables of the ANN method (for example, nodes number for each hidden layer) and the algorithm of PSO-like swarm size and inertia weight are improved by utilizing a total of 28 (i.e., for the PSO-ANN) trial and error approaches. The key properties were fed as input, which were utilized via the analysis of OptumG2 finite element modelling (FEM), containing undrained cohesion stability of the baseline soil (Cu), angle of the original slope (β), and setback distance ratio (b/B) where the target is selected factor of safety. The estimated data for datasets of ANN, ANFIS, and PSO-ANN models were examined based on three determined statistical indexes. Namely, root mean square error (RMSE) and the coefficient of determination (R2). After accomplishing the analysis of sensitivity, considering 72 trials and errors of the neurons number, the optimized architecture of 4 × 6 × 1 was determined to the structure of the ANN model. As an outcome, the employed methods presented excellent efficiency, but based on the ranking method, the PSO-ANN approach might have slightly better efficiency in comparison to the algorithms of ANN and ANFIS. According to statistics, for the proper structure of PSO-ANN, the indexes of R2 and RMSE values of 0.9996, and 0.0123, as well as 0.9994 and 0.0157, were calculated for the training and testing networks. Nevertheless, having the ANN model with six neurons for each hidden layer was formulized for further practical use. 
This study demonstrates the efficiency of the proposed neuro model of PSO-ANN in estimating the factor of safety compared to other conventional techniques. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

30. Predicting Slope Stability Failure through Machine Learning Paradigms.

Author: Tien Bui, Dieu, Moayedi, Hossein, Gör, Mesut, Jaafari, Abolfazl, and Foong, Loke Kok
Subjects: *SLOPE stability, *SAFETY factor in engineering, *MACHINE learning, *KRIGING, *ENGINEERING design, *REGRESSION analysis, *PREDICTIVE control systems, *EARTH dams
Abstract: In this study, we employed various machine learning-based techniques in predicting factor of safety against slope failures. Different regression methods namely, multi-layer perceptron (MLP), Gaussian process regression (GPR), multiple linear regression (MLR), simple linear regression (SLR), support vector regression (SVR) were used. Traditional methods of slope analysis (e.g., first established in the first half of the twentieth century) used widely as engineering design tools. Offering more progressive design tools, such as machine learning-based predictive algorithms, they draw the attention of many researchers. The main objective of the current study is to evaluate and optimize various machine learning-based and multilinear regression models predicting the safety factor. To prepare training and testing datasets for the predictive models, 630 finite limit equilibrium analysis modelling (i.e., a database including 504 training datasets and 126 testing datasets) were employed on a single-layered cohesive soil layer. The estimated results for the presented database from GPR, MLR, MLP, SLR, and SVR were assessed by various methods. Firstly, the efficiency of applied models was calculated employing various statistical indices. As a result, obtained total scores 20, 35, 50, 10, and 35, respectively for GPR, MLR, MLP, SLR, and SVR, revealed that the MLP outperformed other machine learning-based models. In addition, SVR and MLR presented an almost equal accuracy in estimation, for both training and testing phases. Note that, an acceptable degree of efficiency was obtained for GPR and SLR models. However, GPR showed more precision. Following this, the equation of applied MLP and MLR models (i.e., in their optimal condition) was derived, due to the reliability of their results, to be used in similar slope stability problems. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

31. Prediction of Pullout Behavior of Belled Piles through Various Machine Learning Modelling Techniques.

Author: Tien Bui, Dieu, Moayedi, Hossein, Abdullahi, Mu'azu Mohammed, Safuan A Rashid, Ahmad, and Nguyen, Hoang
Abstract: The main goal of this study is to estimate the pullout forces by developing various modelling technique like feedforward neural network (FFNN), radial basis functions neural networks (RBNN), general regression neural network (GRNN) and adaptive neuro-fuzzy inference system (ANFIS). A hybrid learning algorithm, including a back-propagation and least square estimation, is utilized to train ANFIS in MATLAB (software). Accordingly, 432 samples have been applied, through which 300 samples have been considered as training dataset with 132 ones for testing dataset. All results have been analyzed by ANFIS, in which the reliability has been confirmed through the comparing of the results. Consequently, regarding FFNN, RBNN, GRNN, and ANFIS, statistical indexes of coefficient of determination (R2), variance account for (VAF) and root mean square error (RMSE) in the values of (0.957, 0.968, 0.939, 0.902, 0.998), (95.677, 96.814, 93.884, 90.131, 97.442) and (2.176, 1.608, 3.001, 4.39, 0.058) have been achieved for training datasets and the values of (0.951, 0.913, 0.729, 0.685 and 0.995), (95.04, 91.13, 72.745, 66.228, 96.247) and (2.433, 4.032, 8.005, 10.188 and 1.252) are for testing datasets indicating a satisfied reliability of ANFIS in estimating of pullout behavior of belled piles. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

32. New Ensemble Models for Shallow Landslide Susceptibility Modeling in a Semi-Arid Watershed.

Author: Tien Bui, Dieu, Shirzadi, Ataollah, Shahabi, Himan, Geertsema, Marten, Omidvar, Ebrahim, Clague, John J., Thai Pham, Binh, Dou, Jie, Talebpour Asl, Dawood, Bin Ahmad, Baharin, and Lee, Saro
Subjects: LANDSLIDES, STANDARD deviations, SUPPORT vector machines, WATERSHEDS
Abstract: We prepared a landslide susceptibility map for the Sarkhoon watershed, Chaharmahal-w-bakhtiari, Iran, using novel ensemble artificial intelligence approaches. A classifier of support vector machine (SVM) was employed as a base classifier, and four Meta/ensemble classifiers, including Adaboost (AB), bagging (BA), rotation forest (RF), and random subspace (RS), were used to construct new ensemble models. SVM has been used previously to spatially predict landslides, but not together with its ensembles. We selected 20 conditioning factors and randomly partitioned 98 landslide locations into training (70%) and validating (30%) groups. Several statistical metrics, including sensitivity, specificity, accuracy, kappa, root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC), were used for model comparison and validation. Using the One-R Attribute Evaluation (ORAE) technique, we found that all 20 conditioning factors were significant in identifying landslide locations, but "distance to road" was found to be the most important. The RS (AUC = 0.837) and RF (AUC = 0.834) significantly improved the goodness-of-fit and prediction accuracy of the SVM (AUC = 0.810), whereas the BA (AUC = 0.807) and AB (AUC = 0.779) did not. The random subspace based support vector machine (RSSVM) model is a promising technique for helping to better manage land in landslide-prone areas. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

33. Multi-Hazard Exposure Mapping Using Machine Learning Techniques: A Case Study from Iran.

Author: Rahmati, Omid, Yousefi, Saleh, Kalantari, Zahra, Uuemaa, Evelyn, Teimurian, Teimur, Keesstra, Saskia, Pham, Tien Dat, and Tien Bui, Dieu
Subjects: RECEIVER operating characteristic curves, MACHINE learning, SUPPORT vector machines, FLOOD warning systems, STATISTICAL reliability, REGRESSION trees
Abstract: Mountainous areas are highly prone to a variety of nature-triggered disasters, which often cause disabling harm, death, destruction, and damage. In this work, an attempt was made to develop an accurate multi-hazard exposure map for a mountainous area (Asara watershed, Iran), based on state-of-the-art machine learning techniques. Hazard modeling for avalanches, rockfalls, and floods was performed using three state-of-the-art models: support vector machine (SVM), boosted regression tree (BRT), and generalized additive model (GAM). Topo-hydrological and geo-environmental factors were used as predictors in the models. A flood dataset (n = 133 flood events) was applied, which had been prepared using Sentinel-1-based processing and ground-based information. In addition, snow avalanche (n = 58) and rockfall (n = 101) data sets were used. The data set of each hazard type was randomly divided into two groups: training (70%) and validation (30%). Model performance was evaluated by the true skill score (TSS) and the area under receiver operating characteristic curve (AUC) criteria. Using an exposure map, the multi-hazard map was converted into a multi-hazard exposure map. According to both validation methods, the SVM model showed the highest accuracy for avalanches (AUC = 92.4%, TSS = 0.72) and rockfalls (AUC = 93.7%, TSS = 0.81), while BRT demonstrated the best performance for flood hazards (AUC = 94.2%, TSS = 0.80). Overall, multi-hazard exposure modeling revealed that valleys and areas close to the Chalous Road, one of the most important roads in Iran, were associated with high and very high levels of risk. The proposed multi-hazard exposure framework can be helpful in supporting decision making on mountain social-ecological systems facing multiple hazards. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF
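
Note on the metrics in record 33: the true skill score (TSS) and AUC reported above can be illustrated with a minimal Python sketch. This is not the authors' code. The function names and the toy labels are hypothetical, TSS is taken in its usual form (sensitivity + specificity - 1), and AUC is computed as the fraction of positive/negative pairs ranked correctly, with ties counted as half.

```python
def true_skill_statistic(y_true, y_pred):
    """TSS = sensitivity + specificity - 1, for binary labels (1 = hazard, 0 = no hazard)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking of positives against negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
labels = [1, 1, 1, 0, 0, 0]
hard_predictions = [1, 1, 0, 0, 0, 1]
susceptibility = [0.9, 0.8, 0.3, 0.2, 0.1, 0.7]

tss_value = true_skill_statistic(labels, hard_predictions)
auc_value = auc(labels, susceptibility)
```

TSS ranges from -1 to 1 and AUC from 0 to 1, which is why the record reports them on different scales.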

34. A Novel Ensemble Artificial Intelligence Approach for Gully Erosion Mapping in a Semi-Arid Watershed (Iran).

Author: Tien Bui, Dieu, Shirzadi, Ataollah, Shahabi, Himan, Chapi, Kamran, Omidavr, Ebrahim, Pham, Binh Thai, Talebpour Asl, Dawood, Khaledian, Hossein, Pradhan, Biswajeet, Panahi, Mahdi, Bin Ahmad, Baharin, Rahmani, Hosein, Gróf, Gyula, and Lee, Saro
Subjects: *ARTIFICIAL intelligence, *EROSION, *SUPPORT vector machines, *GEOMORPHOLOGY, *WATERSHEDS, *KERNEL functions
Abstract: In this study, we introduced a novel hybrid artificial intelligence approach of rotation forest (RF) as a Meta/ensemble classifier based on alternating decision tree (ADTree) as a base classifier called RF-ADTree in order to spatially predict gully erosion at Klocheh watershed of Kurdistan province, Iran. A total of 915 gully erosion locations along with 22 gully conditioning factors were used to construct a database. Some soft computing benchmark models (SCBM) including the ADTree, the Support Vector Machine by two kernel functions such as Polynomial and Radial Base Function (SVM-Polynomial and SVM-RBF), the Logistic Regression (LR), and the Naïve Bayes Multinomial Updatable (NBMU) models were used for comparison of the designed model. Results indicated that 19 conditioning factors were effective among which distance to river, geomorphology, land use, hydrological group, lithology and slope angle were the most remarkable factors for gully modeling process. Additionally, results of modeling concluded the RF-ADTree ensemble model could significantly improve (area under the curve (AUC) = 0.906) the prediction accuracy of the ADTree model (AUC = 0.882). The new proposed model had also the highest performance (AUC = 0.913) in comparison to the SVM-Polynomial model (AUC = 0.879), the SVM-RBF model (AUC = 0.867), the LR model (AUC = 0.75), the ADTree model (AUC = 0.861) and the NBMU model (AUC = 0.811). [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

35. An Automated Python Language-Based Tool for Creating Absence Samples in Groundwater Potential Mapping.

Author: Rahmati, Omid, Moghaddam, Davoud Davoudi, Moosavi, Vahid, Kalantari, Zahra, Samadi, Mahmood, Lee, Saro, and Tien Bui, Dieu
Subjects: PYTHON programming language, GROUNDWATER, STATISTICAL sampling, GEOGRAPHIC information systems, WATERSHEDS, MACHINE learning
Abstract: Although sampling strategy plays an important role in groundwater potential mapping and significantly influences model accuracy, researchers often apply a simple random sampling method to determine absence (non-occurrence) samples. In this study, an automated, user-friendly geographic information system (GIS)-based tool, selection of absence samples (SAS), was developed using the Python programming language. The SAS tool takes into account different geospatial concepts, including nearest neighbor (NN) and hotspot analyses. In a case study, it was successfully applied to the Bojnourd watershed, Iran, together with two machine learning models (random forest (RF) and multivariate adaptive regression splines (MARS)) with GIS and remotely sensed data, to model groundwater potential. Different evaluation criteria (area under the receiver operating characteristic curve (AUC-ROC), true skill statistic (TSS), efficiency (E), false positive rate (FPR), true positive rate (TPR), true negative rate (TNR), and false negative rate (FNR)) were used to scrutinize model performance. Two absence sample types were produced, based on a simple random method and the SAS tool, and used in the models. The results demonstrated that both RF (AUC-ROC = 0.913, TSS = 0.72, E = 0.926) and MARS (AUC-ROC = 0.889, TSS = 0.705, E = 0.90) performed better when using absence samples generated by the SAS tool, indicating that this tool is capable of producing trustworthy absence samples to improve groundwater potential models. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

36. Shallow Landslide Prediction Using a Novel Hybrid Functional Machine Learning Algorithm.

Author: Tien Bui, Dieu, Shahabi, Himan, Omidvar, Ebrahim, Shirzadi, Ataollah, Geertsema, Marten, Clague, John J., Khosravi, Khabat, Pradhan, Biswajeet, Pham, Binh Thai, Chapi, Kamran, Barati, Zahra, Bin Ahmad, Baharin, Rahmani, Hosein, Gróf, Gyula, and Lee, Saro
Subjects: *MACHINE learning, *LANDSLIDE hazard analysis, *LOGISTIC regression analysis, *LANDSLIDE prediction, *SUPPORT vector machines
Abstract: We used a novel hybrid functional machine learning algorithm to predict the spatial distribution of landslides in the Sarkhoon watershed, Iran. We developed a new ensemble model, named the ABSGD model, which combines a functional algorithm, stochastic gradient descent (SGD), with an AdaBoost (AB) meta classifier to predict the landslides. The model incorporates 20 landslide conditioning factors, which we ranked using the least-square support vector machine (LSSVM) technique. For the modeling, we considered 98 landslide locations, of which 70% (79) were used for training and 30% (19) for validation processes. Model validation was performed using sensitivity, specificity, accuracy, the root mean square error (RMSE) and the area under the receiver operating characteristic curve (AUC). We also used soft computing benchmark models, including SGD, logistic regression (LR), logistic model tree (LMT) and functional tree (FT) algorithms for model validation and comparison. The selected conditioning factors were significant in landslide occurrence but distance to road was found to be the most important factor. The ABSGD model (AUC = 0.860) outperformed the LR (0.797), SGD (0.776), LMT (0.740) and FT (0.734) models. Our results confirm that the combined use of a functional algorithm and a meta classifier prevents over-fitting, reduces noise and enhances the predictive power of the individual SGD algorithm for the spatial prediction of landslides. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

37. Hybrid Machine Learning Approaches for Landslide Susceptibility Modeling.

Author: Nguyen, Vu Viet, Pham, Binh Thai, Vu, Ba Thao, Prakash, Indra, Jha, Sudan, Shahabi, Himan, Shirzadi, Ataollah, Ba, Dong Nguyen, Kumar, Raghvendra, Chatterjee, Jyotir Moy, and Tien Bui, Dieu
Subjects: LANDSLIDES, LANDSLIDE hazard analysis, LANDSLIDE prediction, MACHINE learning, PARTICLE swarm optimization
Abstract: This paper presents novel hybrid machine learning models, namely Adaptive Neuro Fuzzy Inference System optimized by Particle Swarm Optimization (PSOANFIS), Artificial Neural Networks optimized by Particle Swarm Optimization (PSOANN), and Best First Decision Trees based Rotation Forest (RFBFDT), for landslide spatial prediction. Landslide modeling of the study area of Van Chan district, Yen Bai province (Vietnam) was carried out with the help of a spatial database of the area, considering past landslides and 12 landslide conditioning factors. The proposed models were validated using different methods such as Area under the Receiver Operating Characteristics (ROC) curve (AUC), Mean Square Error (MSE), Root Mean Square Error (RMSE). Results indicate that the RFBFDT (AUC = 0.826, MSE = 0.189, and RMSE = 0.434) is the best method in comparison to other hybrid models, namely PSOANFIS (AUC = 0.76, MSE = 0.225, and RMSE = 0.474) and PSOANN (AUC = 0.72, MSE = 0.312, and RMSE = 0.558). Thus, it is reasonably concluded that the RFBFDT is a promising hybrid machine learning approach for landslide susceptibility modeling. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

38. Soil Salinity Mapping Using SAR Sentinel-1 Data and Advanced Machine Learning Algorithms: A Case Study at Ben Tre Province of the Mekong River Delta (Vietnam).

Author: Hoa, Pham Viet, Giang, Nguyen Vu, Binh, Nguyen An, Hai, Le Vu Hong, Pham, Tien-Dat, Hasanlou, Mahdi, and Tien Bui, Dieu
Subjects: SOIL salinity, MACHINE learning, GAUSSIAN processes, SUPPORT vector machines, SOIL moisture, REGRESSION analysis
Abstract: Soil salinity caused by climate change associated with rising sea level is considered one of the most severe natural hazards, with a negative effect on agricultural activities in the coastal areas of most tropical climates. This issue has become more severe and more frequent in the Mekong River Delta of Vietnam. The main objective of this work is to map soil salinity intrusion in Ben Tre province, located in the Mekong River Delta of Vietnam, using Sentinel-1 Synthetic Aperture Radar (SAR) C-band data combined with five state-of-the-art machine learning models: Multilayer Perceptron Neural Networks (MLP-NN), Radial Basis Function Neural Networks (RBF-NN), Gaussian Processes (GP), Support Vector Regression (SVR), and Random Forests (RF). For this purpose, 63 soil samples were collected during the field survey conducted from 4–6 April 2018 corresponding to the Sentinel-1 SAR imagery. The performance of the five models was assessed and compared using the root-mean-square error (RMSE), the mean absolute error (MAE), and the correlation coefficient (r). The results revealed that the GP model yielded the highest prediction performance (RMSE = 2.885, MAE = 1.897, and r = 0.808) and outperformed the other machine learning models. We conclude that the advanced machine learning models can be used for mapping soil salinity in the Delta areas, thus providing a useful tool for assisting farmers and policy makers in choosing better crop types in the context of climate change. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

39. Novel GIS Based Machine Learning Algorithms for Shallow Landslide Susceptibility Mapping.

Author: Shirzadi, Ataollah, Soliamani, Karim, Habibnejhad, Mahmood, Kavian, Ataollah, Chapi, Kamran, Shahabi, Himan, Chen, Wei, Khosravi, Khabat, Thai Pham, Binh, Pradhan, Biswajeet, Ahmad, Anuar, Bin Ahmad, Baharin, and Tien Bui, Dieu
Subjects: GEOGRAPHIC information systems, MACHINE learning, ALGORITHMS, LANDSLIDES, RECEIVER operating characteristic curves, DECISION trees
Abstract: The main objective of this research was to introduce a novel machine learning algorithm of alternating decision tree (ADTree) based on the multiboost (MB), bagging (BA), rotation forest (RF) and random subspace (RS) ensemble algorithms under two scenarios of different sample sizes and raster resolutions for spatial prediction of shallow landslides around Bijar City, Kurdistan Province, Iran. The modeling process was evaluated using several statistical measures and the area under the receiver operating characteristic curve (AUROC). Results show that the RS model obtained a high goodness-of-fit and prediction accuracy for sample sizes of 60%/40% and 70%/30% with a raster resolution of 10 m, while the MB model did so for sample sizes of 80%/20% and 90%/10% with a raster resolution of 20 m. The RS-ADTree and MB-ADTree ensemble models outperformed the ADTree model in the two scenarios. Overall, MB-ADTree with a sample size of 80%/20% and a resolution of 20 m (area under the curve (AUC) = 0.942) and with a sample size of 60%/40% and a resolution of 10 m (AUC = 0.845) had the highest and lowest prediction accuracy, respectively. The findings confirm that the newly proposed models are very promising alternative tools to assist planners and decision makers in the task of managing landslide prone areas. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

40. A Novel Hybrid Swarm Optimized Multilayer Neural Network for Spatial Prediction of Flash Floods in Tropical Areas Using Sentinel-1 SAR Imagery and Geospatial Data.

Author: Ngo, Phuong-Thao Thi, Hoang, Nhat-Duc, Pradhan, Biswajeet, Nguyen, Quang Khanh, Tran, Xuan Truong, Nguyen, Quang Minh, Nguyen, Viet Nghia, Samui, Pijush, and Tien Bui, Dieu
Subjects: SYNTHETIC aperture radar, ALGORITHMS, PARTICLE swarm optimization, ARTIFICIAL neural networks, BACK propagation, MACHINE learning, GEOSPATIAL data
Abstract: Flash floods are widely recognized as one of the most devastating natural hazards in the world, therefore prediction of flash flood-prone areas is crucial for public safety and emergency management. This research proposes a new methodology for spatial prediction of flash floods based on Sentinel-1 SAR imagery and a new hybrid machine learning technique. The SAR imagery is used to detect flash flood inundation areas, whereas the new machine learning technique, which is a hybrid of the firefly algorithm (FA), Levenberg–Marquardt (LM) backpropagation, and an artificial neural network (named as FA-LM-ANN), was used to construct the prediction model. The Bac Ha Bao Yen (BHBY) area in the northwestern region of Vietnam was used as a case study. Accordingly, a Geographical Information System (GIS) database was constructed using 12 input variables (elevation, slope, aspect, curvature, topographic wetness index, stream power index, toposhade, stream density, rainfall, normalized difference vegetation index, soil type, and lithology) and subsequently the output of flood inundation areas was mapped. Using the database and FA-LM-ANN, the flash flood model was trained and verified. The model performance was validated via various performance metrics including the classification accuracy rate, the area under the curve, precision, and recall. Then, the flash flood model that produced the highest performance was compared with benchmarks, indicating that the combination of FA and LM backpropagation is proven to be very effective and the proposed FA-LM-ANN is a new and useful tool for predicting flash flood susceptibility. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

41. Land Subsidence Susceptibility Mapping in South Korea Using Machine Learning Algorithms.

Author: Tien Bui, Dieu, Shahabi, Himan, Shirzadi, Ataollah, Chapi, Kamran, Pradhan, Biswajeet, Chen, Wei, Khosravi, Khabat, Panahi, Mahdi, Bin Ahmad, Baharin, and Saro, Lee
Subjects: *MACHINE learning, *COMPUTER algorithms, *SUPPORT vector machines, *LAND subsidence, *GEOGRAPHIC information systems
Abstract: In this study, land subsidence susceptibility was assessed for a study area in South Korea by using four machine learning models: Bayesian Logistic Regression (BLR), Support Vector Machine (SVM), Logistic Model Tree (LMT) and Alternate Decision Tree (ADTree). Eight conditioning factors were identified as the most important factors affecting land subsidence in the Jeong-am area and were applied in the modelling: slope angle, distance to drift, drift density, geology, distance to lineament, lineament density, land use and rock-mass rating (RMR). About 24 previous land subsidence occurrences were surveyed and used as the training dataset (70% of data) and validation dataset (30% of data) in the modelling process. Each studied model generated a land subsidence susceptibility map (LSSM). The maps were verified using several appropriate tools including statistical indices, the area under the receiver operating characteristic (AUROC) and success rate (SR) and prediction rate (PR) curves. The results of this study indicated that the BLR model produced an LSSM with higher accuracy and reliability than the other applied models, even though the other models also had reasonable results. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

42. Landslide Susceptibility Assessment at Mila Basin (Algeria): A Comparative Assessment of Prediction Capability of Advanced Machine Learning Methods.

Author: Merghadi, Abdelaziz, Abderrahmane, Boumezbeur, and Tien Bui, Dieu
Subjects: LANDSLIDES, CARTOGRAPHY, MACHINE learning
Abstract: Landslide risk prevention requires the delineation of landslide-prone areas as accurately as possible. Therefore, selecting a method or a technique that is capable of providing the highest landslide prediction capability is highly important. The main objective of this study is to assess and compare the prediction capability of advanced machine learning methods for landslide susceptibility mapping in the Mila Basin (Algeria). First, a geospatial database was constructed from various sources. The database contains 1156 landslide polygons and 16 conditioning factors (altitude, slope, aspect, topographic wetness index (TWI), landforms, rainfall, lithology, stratigraphy, soil type, soil texture, landuse, depth to bedrock, bulk density, distance to faults, distance to hydrographic network, and distance to road networks). Subsequently, the database was randomly resampled into training sets and validation sets using 10-fold cross-validation repeated 5 times. Using the training and validation sets, five landslide susceptibility models were constructed, assessed, and compared using Random Forest (RF), Gradient Boosting Machine (GBM), Logistic Regression (LR), Artificial Neural Network (NNET), and Support Vector Machine (SVM). The prediction capability of the five landslide models was assessed and compared using the receiver operating characteristic (ROC) curve, the area under the ROC curves (AUC), overall accuracy (Acc), and kappa index. Additionally, Wilcoxon signed-rank tests were performed to confirm statistical significance in the differences among the five machine learning models employed in this study. The result showed that the GBM model has the highest prediction capability (AUC = 0.8967), followed by the RF model (AUC = 0.8957), the NNET model (AUC = 0.8882), the SVM model (AUC = 0.8818), and the LR model (AUC = 0.8575). Therefore, we concluded that GBM and RF are the most suitable for this study area and should be used to produce landslide susceptibility maps. These maps serve as a technical framework to develop countermeasures and regulatory policies to minimize landslide damages in the Mila Basin. This research demonstrated the benefit of selecting the best advanced machine learning method for landslide susceptibility assessment. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF
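
Note on the resampling in record 42: the abstract describes 10-fold cross-validation repeated 5 times. A minimal standard-library Python sketch of how such index splits could be generated follows. It is an illustration under stated assumptions, not the authors' procedure: the function name, the seed, and the unstratified shuffling are all hypothetical choices, and the study may have stratified or grouped its samples differently.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=5, seed=0):
    """Yield (train_indices, validation_indices) pairs for repeated k-fold CV.

    Each repeat reshuffles the sample indices, deals them into n_splits folds,
    and uses every fold once as the validation set.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            validation = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, validation


# 1156 is the number of landslide polygons mentioned in the abstract.
splits = list(repeated_kfold(1156))
```

With the defaults this yields 50 train/validation pairs, and within each repeat every sample is used for validation exactly once.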

43. Assessment of advanced random forest and decision tree algorithms for modeling rainfall-induced landslide susceptibility in the Izu-Oshima Volcanic Island, Japan.

Author: Dou, Jie, Yunus, Ali P., Tien Bui, Dieu, Merghadi, Abdelaziz, Sahana, Mehebub, Zhu, Zhongfan, Chen, Chi-Wen, Khosravi, Khabat, Yang, Yong, and Pham, Binh Thai
Abstract: Landslides represent a part of the cascade of geological hazards in a wide range of geo-environments. In this study, we aim to investigate and compare the performance of two state-of-the-art machine learning models, i.e., decision tree (DT) and random forest (RF) approaches, to model the massive rainfall-triggered landslide occurrences in the Izu-Oshima Volcanic Island, Japan at a regional scale. At first, a landslide inventory map is prepared consisting of 44 landslide polygons (10,444 pixels) from aerial photo-interpretation and field surveys. To estimate the robustness of the models, we randomly selected two different samples (S1 and S2), comprising both positive and negative cells (70% of total landslides, 7293 pixels) for training and the remainder (30%, 3151 pixels) for validation. Twelve causative factors including altitude, slope angle, slope aspect, plan curvature, total curvature, compound topographic index, stream power index, distance to drainage network, drainage density, distance to geological boundaries, lithology and cumulative rainfall were selected as predictors to implement the landslide susceptibility model. The area under the receiver operating characteristics (ROC) curves (AUC) and other statistical signifiers were used to verify the model accuracies. The result shows that the DT and RF models achieved remarkable predictive performance (AUC > 0.9), producing nearly accurate susceptibility maps. The overall efficiency of RF (AUC = 0.956) is found to be significantly higher than the DT (AUC = 0.928) results. Additionally, we noticed that the performance of RF for modeling landslide susceptibility is very robust even though the training and validation samples are altered. Considering the performances, we suggest that both RF and DT models can be used in other similar non-eruption-related landslide studies in tephra-rich volcanoes, as they are capable of rapidly generating accurate and stable LSM maps for risk mitigation, management practices, and decision-making. Moreover, the RF-based model is promising enough to be recommended as a method to map regional landslide susceptibility. Highlights: • Decision tree and random forest models applied to map landslide-prone areas in a volcanic island. • Two sample sets (S1 and S2) for computing the robustness of the model. • LSM maps were compared using different assessment principles. • Random forest performs better on both samples with AUC > 0.9. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

44. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees.

Author: Pham, Binh Thai, Prakash, Indra, and Tien Bui, Dieu
Subjects: *MACHINE learning, *LANDSLIDES, *GEOMORPHOLOGY, *PHYSICAL geology, *SUPPORT vector machines, *CHI-squared test
Abstract: A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method, which is known as an efficient ensemble technique, and the CART, which is a state-of-the-art classifier. The Luc Yen district of Yen Bai province, a prominent landslide prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of the model, ten important landslide affecting factors related to geomorphology, geology and geo-environment were considered, namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that the RSSCART is a promising method for spatial landslide prediction. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

45. A comparative study of different machine learning methods for landslide susceptibility assessment: A case study of Uttarakhand area (India).

Author: Pham, Binh Thai, Pradhan, Biswajeet, Tien Bui, Dieu, Prakash, Indra, and Dholakia, M.B.
Subjects: *MACHINE learning, *LANDSLIDES, *SUPPORT vector machines, *LOGISTIC regression analysis, *DISCRIMINANT analysis, *NAIVE Bayes classification
Abstract: Landslide susceptibility assessment of Uttarakhand area of India has been done by applying five machine learning methods namely Support Vector Machines (SVM), Logistic Regression (LR), Fisher's Linear Discriminant Analysis (FLDA), Bayesian Network (BN), and Naïve Bayes (NB). Performance of these methods has been evaluated using the ROC curve and statistical index based methods. Analysis and comparison of the results show that all five landslide models performed well for landslide susceptibility assessment (AUC = 0.910–0.950). However, it has been observed that the SVM model (AUC = 0.950) has the best performance in comparison to other landslide models, followed by the LR model (AUC = 0.922), the FLDA model (AUC = 0.921), the BN model (AUC = 0.915), and the NB model (AUC = 0.910), respectively. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

46. Spatial predicting of flood potential areas using novel hybridizations of fuzzy decision-making, bivariate statistics, and machine learning.

Author: Costache, Romulus, Popa, Mihnea Cristian, Tien Bui, Dieu, Diaconu, Daniel Constantin, Ciubotaru, Nicu, Minea, Gabriel, and Pham, Quoc Bao
Subjects: *ANALYTIC hierarchy process, *PLANT hybridization, *MACHINE learning, *RECEIVER operating characteristic curves, *SUPPORT vector machines, *CLIMATE change, *FUZZY decision making, *PROPORTIONAL hazards models
Abstract: Highlights: • Flood susceptibility was derived through IoE, FAHP, FAHP-IoE, SVM and SVM-IoE models. • 205 flood pixels were the dependent variable in the stand-alone and ensemble models. • Between 13% and 20% of the study area has high or very high flood susceptibility. • SVM-IoE was the best-performing model. Global warming and climate change have caused a considerable increase in the frequency of floods and their related damages. Therefore, the high accuracy prediction of flood susceptible areas plays a key role in flood warnings and risk reduction. The main objective of this study is to propose novel hybridizations of fuzzy Analytical Hierarchy Process (FAHP), Index of Entropy (IoE), and Support Vector Machine (SVM) for predicting the areas susceptible to floods. The study focused on the Buzău river catchment (Romania). In this regard, a database with 205 flooded locations, 205 non-flood locations and 12 flood predictors was established and used to train and validate the flood susceptibility models. The performance of the proposed models was evaluated using the Receiver Operating Characteristic (ROC) Curve and statistical metrics. The results show that all the hybrid models have a high prediction performance and outperform the stand-alone models. Among them, the SVM-IoE model (AUC = 0.979) has the highest performance, followed by the FAHP-IoE (AUC = 0.97), IoE (AUC = 0.969), SVM (AUC = 0.966) and FAHP (AUC = 0.947). These results highlight a very high efficiency of all the applied models. The application of the models mentioned above revealed that a percentage between 12.5% (FPI IoE) and 21.2% (FPI FAHP) of the study area is characterized by high and very high exposure to these hydrological hazards. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

47. Predicting uncertainty of machine learning models for modelling nitrate pollution of groundwater using quantile regression and UNEEC methods.

Author: Rahmati, Omid, Choubin, Bahram, Fathabadi, Abolhasan, Coulon, Frederic, Soltani, Elinaz, Shahabi, Himan, Mollaefar, Eisa, Tiefenbacher, John, Cipullo, Sabrina, Ahmad, Baharin Bin, and Tien Bui, Dieu
Abstract: Although estimating the uncertainty of models used for modelling nitrate contamination of groundwater is essential in groundwater management, it has been generally ignored. This issue motivates this research to explore the predictive uncertainty of machine-learning (ML) models in this field of study using two different residual uncertainty methods: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Prediction-interval coverage probability (PICP), the most important of the statistical measures of uncertainty, was used to evaluate uncertainty. Additionally, three state-of-the-art ML models including support vector machine (SVM), random forest (RF), and k-nearest neighbor (kNN) were selected to spatially model groundwater nitrate concentrations. The models were calibrated with nitrate concentrations from 80 wells (70% of the data) and then validated with nitrate concentrations from 34 wells (30% of the data). Both uncertainty and predictive performance criteria should be considered when comparing and selecting the best model. Results highlight that the kNN model is the best model because not only did it have the lowest uncertainty based on the PICP statistic in both the QR (0.94) and the UNEEC (in all clusters, 0.85–0.91) methods, but it also had predictive performance statistics (RMSE = 10.63, R2 = 0.71) that were relatively similar to RF (RMSE = 10.41, R2 = 0.72) and higher than SVM (RMSE = 13.28, R2 = 0.58). Determining the uncertainty of ML models used for spatially modelling groundwater-nitrate pollution enables managers to achieve better risk-based decision making and consequently increases the reliability and credibility of groundwater-nitrate predictions. Highlights: • Predictive uncertainty of models was estimated using the QR and UNEEC methods. • Random Forest model had the lower uncertainty band width based on both methods. • Groundwater nitrate (NO3) concentrations were predicted using RF, SVM, and kNN. • Random Forest model outperformed other models in terms of predictive performance. • Hydraulic conductivity and elevation had the highest contribution to the modelling. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF
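
Note on the uncertainty measure in record 47: prediction-interval coverage probability (PICP) is commonly defined as the share of observations that fall inside their prediction intervals. The Python sketch below shows that calculation under this assumption. The function name and the toy values are hypothetical and are not taken from the study, which derives its interval bounds with the QR and UNEEC methods.

```python
def picp(observed, lower, upper):
    """Fraction of observed values lying within [lower, upper] prediction bounds."""
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("observed, lower and upper must have the same length")
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)


# Toy nitrate-like values with hypothetical interval bounds.
observed = [1.0, 5.0, 10.0, 20.0]
lower_bounds = [0.0, 6.0, 8.0, 15.0]
upper_bounds = [2.0, 9.0, 12.0, 25.0]

coverage = picp(observed, lower_bounds, upper_bounds)
```

A PICP close to the nominal confidence level of the interval indicates well-calibrated bounds; the value alone says nothing about how wide the intervals are.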

48. Novel ensembles of COPRAS multi-criteria decision-making with logistic regression, boosted regression tree, and random forest for spatial prediction of gully erosion susceptibility.

Author: Arabameri, Alireza, Yamani, Mojtaba, Pradhan, Biswajeet, Melesse, Assefa, Shirani, Kourosh, and Tien Bui, Dieu
Abstract: Gully erosion is considered as a severe environmental problem in many areas of the world which causes huge damages to agricultural lands and infrastructures (i.e. roads, buildings, and bridges); however, gully erosion modeling and prediction with high accuracy are still difficult due to the complex interactions of various factors. The objective of this research was to develop and introduce three new ensemble models, which were based on Complex Proportional Assessment of Alternatives (COPRAS), Logistic Regression (LR), Boosted Regression Tree (BRT), Random Forest (RF), and Frequency Ratio (FR) for spatial prediction of gully erosion with a case study at the Najafabad watershed (Iran). For this purpose, a total of 290 head-cut of gullies and 17 conditioning factors were collected and used to establish a geospatial database. Subsequently, FR was used to determine the spatial relationship between the conditioning factors and the head-cut of gullies, whereas RF, BRT, and LR were used to quantify the relative importance of these factors. In the next step, three ensemble gully erosion models, named COPRAS-FR-RF, COPRAS-FR-BRT, and COPRAS-FR-LR were developed and verified. The Success Rate Curve (SRC), and the Prediction Rate Curve (PRC) and their areas under the curves (AUC) were used to check the performance of the three proposed models. The result showed that Soil group, geomorphology, and drainage density factors played the key role on the occurrence of the gully erosion. All the three models have very high degree-of-fit and the prediction performance, the COPRAS-FR-RF model (AUC-SRC = 0.974 and AUC-PRC = 0.929), the COPRAS-FR-BRT model (AUC-SRC = 0.973 and AUC-PRC = 0.928), and the COPRAS-FR-LR model (AUC-SRC = 0.972 and AUC-PRC = 0.926); therefore, it is concluded that they are efficient and new powerful tools which could be used for predicting gully erosion in prone-areas. 
• RF, BRT, and LR are effective methods in determining the importance of the criteria for the COPRAS.
• Three new ensemble models, COPRAS-FR-RF, COPRAS-FR-BRT, and COPRAS-FR-LR, were proposed for gully erosion modeling.
• The proposed models provide >92% prediction accuracy.
[ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

49. Prediction of shear strength of soft soil using machine learning methods.

Author: Pham, Binh Thai, Son, Le Hoang, Hoang, Tuan-Anh, Nguyen, Duc-Manh, and Tien Bui, Dieu
Subjects: *ARTIFICIAL neural networks, *MACHINE learning, *PARTICLE swarm optimization, *SOIL ecology, *SHEAR strength of soils
Abstract: Shear strength of soil is an important engineering parameter used in the design and audit of geotechnical structures. In this research, we investigate and compare the performance of four machine learning methods, Particle Swarm Optimization - Adaptive Network based Fuzzy Inference System (PANFIS), Genetic Algorithm - Adaptive Network based Fuzzy Inference System (GANFIS), Support Vector Regression (SVR), and Artificial Neural Networks (ANN), for predicting the strength of soft soils. For this purpose, case studies of 188 plastic clay soil samples collected from two major projects, the Nhat Tan and Cua Dai bridges in Viet Nam, were used to generate training and testing datasets for constructing and validating the models. The models were validated and compared using RMSE and R. The results show that PANFIS has the highest prediction capability (RMSE = 0.038 and R = 0.601), followed by GANFIS (RMSE = 0.04 and R = 0.569), SVR (RMSE = 0.044 and R = 0.549), and ANN (RMSE = 0.059 and R = 0.49). It can be concluded that, of the four models, PANFIS is a promising technique for predicting the strength of soft soils. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

50. A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, northern Iran.

Author: Khosravi, Khabat, Pham, Binh Thai, Chapi, Kamran, Shirzadi, Ataollah, Shahabi, Himan, Revhaug, Inge, Prakash, Indra, and Tien Bui, Dieu
Subjects: *GROUNDWATER quality, *FLOOD control, *WATERSHED management, *GROUNDWATER analysis, *GEOMORPHOLOGY
Abstract: Floods are among the most damaging natural hazards, causing huge losses of property, infrastructure, and lives. Predicting the locations of flash floods is very difficult because of sudden changes in climatic conditions and man-made factors. However, flood-susceptible areas can be identified in advance with the help of machine learning techniques, allowing proper and timely management of flood hazards. In this study, we tested four decision-tree-based machine learning models, namely Logistic Model Trees (LMT), Reduced Error Pruning Trees (REPT), Naïve Bayes Trees (NBT), and Alternating Decision Trees (ADT), for flash flood susceptibility mapping at the Haraz Watershed in the northern part of Iran. For this, a spatial database was constructed with 201 present and past flood locations and eleven flood-influencing factors, namely ground slope, altitude, curvature, Stream Power Index (SPI), Topographic Wetness Index (TWI), land use, rainfall, river density, distance from river, lithology, and Normalized Difference Vegetation Index (NDVI). Statistical evaluation measures, the Receiver Operating Characteristic (ROC) curve, and Friedman and Wilcoxon signed-rank tests were used to validate and compare the prediction capability of the models. Results show that the ADT model has the highest prediction capability for flash flood susceptibility assessment, followed by the NBT, the LMT, and the REPT, respectively. These techniques have proven successful in quickly determining flood-susceptible areas. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF
