296 results on '"Random subspace"'
Search Results
2. Characterization of the Freshness of Pork by Near-Infrared Spectroscopy (NIRS) and Ensemble Learning.
- Author
-
Tan, Cheng, Chen, Hui, Zeng, Miao, and Xue, Zhi
- Subjects
*PERISHABLE foods, *INDEPENDENT sets, *CHEMOMETRICS, *PORK, *SPECTROMETRY
- Abstract
Pork is a perishable food and often needs to be stored in the refrigerator to maintain its quality as much as possible. Traditional methods for discriminating fresh and refrigerated pork are subjective, time-consuming, or destructive. The feasibility of using near-infrared (NIR) spectroscopy combined with chemometrics was explored to discriminate fresh and refrigerated pork. A total of 104 samples, including 40 fresh and 64 refrigerated samples, were first prepared and split into training and test sets. Both partial least squares (PLS) and a subspace-based ensemble algorithm were used to establish classifiers. Also, both the number of learners and the size of the subspace were optimized for ensemble modeling. On the independent test set, three measures, that is, the sensitivity, specificity, and total accuracy of the ensemble classifier, were 95%, 93.8%, and 94.2%, respectively, each of which is superior to that of the PLS classifier. In addition, the influence of training set composition on classifier performance was also studied, indicating that ensemble modeling is robust. The results show that NIR spectroscopy coupled with such an ensemble model can serve as a potential tool for discriminating fresh and refrigerated pork. [ABSTRACT FROM AUTHOR]
- Published
- 2024
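The three measures reported in the abstract above follow directly from confusion-matrix counts. The Python sketch below is illustrative only: the counts are hypothetical (the abstract states neither the test-set size nor which class is treated as positive) and were chosen simply because they reproduce the reported percentages.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # share of positives correctly detected
    specificity = tn / (tn + fp)            # share of negatives correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, NOT taken from the paper.
sens, spec, acc = binary_metrics(tp=19, fn=1, tn=30, fp=2)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Because accuracy pools both classes, it always lies between the sensitivity and the specificity, which is consistent with the three figures quoted in the abstract.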
3. Multi-label Random Subspace Ensemble Classification.
- Author
-
Bi, Fan, Zhu, Jianan, and Feng, Yang
- Subjects
*ARTIFICIAL neural networks, *CLASSIFICATION algorithms, *K-nearest neighbor classification, *ALGORITHMS, *CLASSIFICATION
- Abstract
In this work, we develop a new ensemble learning framework, multi-label Random Subspace Ensemble (mRaSE), for multi-label classification. Given a base classifier (e.g., multinomial logistic regression, classification tree, K-nearest neighbors), mRaSE works by first randomly sampling a collection of subspaces, then choosing the best ones that achieve the minimum cross-validation errors and, finally, aggregating the chosen weak learners. In addition to its superior prediction performance, mRaSE also provides a model-free feature ranking depending on the given base classifier. An iterative version of mRaSE is also developed to further improve the performance. A model-free extension is pursued on the iterative version, leading to the so-called Super mRaSE, which accepts a collection of base classifiers as input to the algorithm. We show that the proposed algorithms compare favorably with state-of-the-art classification algorithms, including random forest and deep neural network, via extensive simulation studies and two real data applications. The new algorithms are implemented in an updated version of the R package RaSEn. [ABSTRACT FROM AUTHOR]
- Published
- 2024
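The three steps named in the abstract above (sample random subspaces, keep the ones with the lowest cross-validation error, aggregate the chosen learners) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the RaSEn implementation: the 1-nearest-neighbour base learner, the majority vote, the toy data, and every function and parameter name below are hypothetical choices made for the example.

```python
import random
from collections import Counter


def knn1_predict(train_X, train_y, x, feats):
    """1-nearest-neighbour prediction using only the features in `feats`."""
    nearest = min(range(len(train_X)),
                  key=lambda i: sum((train_X[i][f] - x[f]) ** 2 for f in feats))
    return train_y[nearest]


def cv_error(X, y, feats, folds=3):
    """k-fold cross-validation error of the 1-NN learner on subspace `feats`."""
    n, errors = len(X), 0
    for k in range(folds):
        train = [i for i in range(n) if i % folds != k]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        errors += sum(knn1_predict(tX, ty, X[i], feats) != y[i]
                      for i in range(n) if i % folds == k)
    return errors / n


def fit_subspace_ensemble(X, y, n_learners=5, n_candidates=10,
                          subspace_size=2, seed=0):
    """For each learner, sample candidate subspaces and keep the best by CV error."""
    rng = random.Random(seed)
    p = len(X[0])
    chosen = []
    for _ in range(n_learners):
        candidates = [tuple(sorted(rng.sample(range(p), subspace_size)))
                      for _ in range(n_candidates)]
        chosen.append(min(candidates, key=lambda s: cv_error(X, y, s)))
    return chosen


def predict(X, y, subspaces, x):
    """Aggregate the chosen learners by majority vote."""
    votes = Counter(knn1_predict(X, y, x, s) for s in subspaces)
    return votes.most_common(1)[0][0]


def make_toy_data(seed=1):
    """Three classes; features 0 and 1 are informative, 2 and 3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(30):
        label = i // 10
        X.append([3.0 * label + rng.gauss(0, 0.3),
                  3.0 * label + rng.gauss(0, 0.3),
                  rng.uniform(-5, 5),
                  rng.uniform(-5, 5)])
        y.append(label)
    return X, y


X, y = make_toy_data()
subspaces = fit_subspace_ensemble(X, y)
label = predict(X, y, subspaces, [6.0, 6.0, 0.0, 0.0])
print(subspaces, label)
```

Counting how often each feature appears in the chosen subspaces would give a crude feature ranking, loosely in the spirit of the ranking the abstract mentions.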
4. Application of Big Data Technology in Internet Financial Risk Control.
- Author
-
Jingjing Chen, Guixian Tian, and Jia Wang
- Subjects
INFORMATION technology, DATA structures, FINANCIAL risk, LOSS control, SOFT sets
- Abstract
Big data techniques are now a prevalent focus for specialists and academics owing to the development of information technology, cloud computing, and Internet of Things technologies. A financial risk control model using MSHDS-RS is proposed to address the unreasonable design of features in current risk control techniques. The model's innovation is that it uses a normalized sparse approach to optimize feature fusion after extracting hard and soft features from loan customer information sources, thereby forming integrated features. Then, a base classifier is trained on each feature subset derived from probability sampling, and the results of the base classifiers are fused and optimized using evidence reasoning rules. Observed across feature sets with soft and integrated feature indicators, the accuracy improvement rate of MSHDS-RS was about 3.0% and 3.6% higher, respectively, than that of the existing PMB-RS method. Therefore, the proposed optimization fusion method is reliable and feasible. This research contributes to the control of internet financial risks and has value for effective decision-making on loan platforms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. Novel Ensemble Models Based on the Split‐Point Sampling and Node Attribute Subsampling Classifier for Groundwater Potential Mapping.
- Author
-
Wang, Zhengtao, Le, TienDuy, Tian, Kunjun, Phong, Tran Van, Bien, Tran Xuan, and Pham, Binh Thai
- Subjects
*MACHINE learning, *GROUNDWATER, *WATER supply, *WATERSHEDS, *RAINFALL
- Abstract
Groundwater potential maps are crucial tools for effectively managing water resources, particularly in agriculturally focused countries such as Vietnam. However, creating these maps is a challenging task that requires reliable data and methods. In this study, we integrated the Split‐Point Sampling and Node Attribute Subsampling Classifier (SPAARC) with the Bagging (B), MultiBoostAB (MBAB), and Random Subspace (RSS) ensemble learning techniques and developed three ensemble models: B‐SPAARC, MBAB‐SPAARC, and RSS‐SPAARC. We selected 13 geoenvironmental factors based on their availability, relevance, and association with groundwater potential in the Sesan River basin of Vietnam. We assessed the models' performance using various metrics such as area under the curve (AUC), accuracy, sensitivity, specificity, and RMSE. The findings indicated that the ensemble models performed better than the single SPAARC model in mapping groundwater potential. The MBAB‐SPAARC model demonstrated the highest accuracy with an AUC value of 0.891, followed by B‐SPAARC (AUC = 0.844), RSS‐SPAARC (AUC = 0.871), and the single SPAARC (AUC = 0.853) models. The results also highlighted that elevation, rainfall, land use/cover, and altitude were the most significant factors for mapping groundwater potential in the Sesan River basin. The innovative ensemble models and reliable potential maps developed in this study assist water resource managers in planning water usage based on the benefits and costs for various users and in devising sustainable strategies for using, protecting, and managing groundwater. Key Points: (1) Groundwater potential mapping was carried out in the Central Highlands of Vietnam. (2) Hybrid machine learning models based on Naïve Bayes Tree were developed and used. (3) Results showed that the proposed models are promising and accurate tools. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. Predict Total Sediment Load Using Standalone and Ensemble Machine Learning Models
- Author
-
Kumar, Sanjit, Agarwal, Mayank, Deshpande, Vishal, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Das, Swagatam, editor, Saha, Snehanshu, editor, Coello Coello, Carlos A., editor, and Bansal, Jagdish C., editor
- Published
- 2024
7. A Randomised Non-descent Method for Global Optimisation
- Author
-
Pasechnyuk, Dmitry A., Gornov, Alexander, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Olenev, Nicholas, editor, Evtushenko, Yuri, editor, Jaćimović, Milojica, editor, Khachay, Michael, editor, and Malkova, Vlasta, editor
- Published
- 2024
8. ReliefF-Weighted Neighborhood Rough Sets and Attribute Reduction in Random Multi-Attribute Subspaces.
- Author
-
王莉
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
9. Random Subspace Sampling for Classification with Missing Data.
- Author
-
Cao, Yun-Hao and Wu, Jian-Xin
- Subjects
STATISTICAL sampling, CLASSIFICATION, SAMPLING methods, MISSING data (Statistics)
- Abstract
Many real-world datasets suffer from the unavoidable issue of missing values, and therefore classification with missing data has to be carefully handled since inadequate treatment of missing values will cause large errors. In this paper, we propose a random subspace sampling method, RSS, by sampling missing items from the corresponding feature histogram distributions in random subspaces, which is effective and efficient at different levels of missing data. Unlike most established approaches, RSS does not train on fixed imputed datasets. Instead, we design a dynamic training strategy where the filled values change dynamically by resampling during training. Moreover, thanks to the sampling strategy, we design an ensemble testing strategy where we combine the results of multiple runs of a single model, which is more efficient and resource-saving than previous ensemble methods. Finally, we combine these two strategies with the random subspace method, which makes our estimations more robust and accurate. The effectiveness of the proposed RSS method is well validated by experimental studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Learning from high dimensional data based on weighted feature importance in decision tree ensembles.
- Author
-
Pour, Nayiri Galestian and Shemehsavar, Soudabeh
- Subjects
*DECISION trees, *RANDOM forest algorithms, *IMAGE recognition (Computer vision), *COMPUTATIONAL biology, *MACHINE learning, *DATA analysis
- Abstract
Learning from high dimensional data has been utilized in various applications such as computational biology, image classification, and finance. Most classical machine learning algorithms fail to give accurate predictions in high dimensional settings due to the enormous feature space. In this article, we present a novel ensemble of classification trees based on weighted random subspaces that aims to adjust the distribution of selection probabilities. In the proposed algorithm base classifiers are built on random feature subspaces in which the probability that influential features will be selected for the next subspace, is updated by incorporating grouping information based on previous classifiers through a weighting function. As an interpretation tool, we show that variable importance measures computed by the new method can identify influential features efficiently. We provide theoretical reasoning for the different elements of the proposed method, and we evaluate the usefulness of the new method based on simulation studies and real data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Integrating Support Vector Machines with Different Ensemble Learners for Improving Streamflow Simulation in an Ungauged Watershed.
- Author
-
Takai Eddine, Yahi, Nadir, Marouf, Sabah, Sehtal, and Jaafari, Abolfazl
- Subjects
SUPPORT vector machines, STANDARD deviations, MEDITERRANEAN climate, WATERSHEDS
- Abstract
Streamflow simulation, particularly in ungauged watersheds, poses a significant challenge in surface water hydrology. The estimation of natural river and streamflow has been a research focus in recent years, with numerous strategies proposed. Hybrid ensemble soft computing models have proven their effectiveness in predicting flow rates. This study proposes a modeling approach that integrates a support vector machine (SVM) with several ensemble learning techniques, such as Bagging, Dagging, Random subspace, and Rotation Forest, to predict flow rates in natural rivers of a Mediterranean climate in Algeria. The gauging data of the hydrometric station "Amont des gorges" were used, and the following quantitative parameters were considered: flow, velocity, depth, width, and hydraulic radius. The proposed models were evaluated based on Nash–Sutcliffe efficiency (NSE), root mean square error (RMSE), and correlation coefficient (R). Our results indicated that the ensemble models outperformed the standalone SVM model. More specifically, the SVM-Dagging model performed the best, with RMSE = 6.58, NSE = 0.76 and R = 0.96, followed by SVM-Bagging (RMSE = 6.83, NSE = 0.75, and R = 0.96), SVM-RF (RMSE = 6.95, NSE = 0.74, and R = 0.95), SVM-RSS (RMSE = 8.34, NSE = 0.62, and R = 0.93), and the standalone SVM models (RMSE = 7.71, NSE = 0.68, and R = 0.88), respectively. These findings suggest that the proposed ensemble models are valuable tools for accurately forecasting stream and river flows, aiding planners and decision-makers. Accurate prediction of flow rates in natural rivers can enhance water resource planning, optimize resource allocation, and improve water management practices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
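The three evaluation measures named in the abstract above (NSE, RMSE, and R) have short closed forms. The Python sketch below states them under their usual textbook definitions; the observed and simulated flow values are hypothetical and are not data from the study.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim))
    return cov / (so * ss)


# Hypothetical flow values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.1, 1.9, 3.2, 3.8, 5.1]
print(rmse(observed, simulated), nse(observed, simulated),
      pearson_r(observed, simulated))
```

R measures only linear association, so a model with a systematic bias can score a high R and a low NSE at the same time; this is one reason a ranking by R need not match a ranking by NSE or RMSE.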
12. Modeling Flood Susceptible Areas Using Deep Learning Techniques with Random Subspace: A Case Study of the Mae Chan Basin in Thailand.
- Author
-
Surachai Chantee and Theeraya Mayakul
- Subjects
DEEP learning, ARTIFICIAL neural networks, MACHINE learning, GEOGRAPHIC information systems, WILCOXON signed-rank test, FEATURE selection, LAND cover
- Abstract
Flooding is a recurring global issue that leads to substantial loss of life and property damage. Flood hazard maps are a crucial tool in managing and mitigating the impact of flooding, as they help identify high-risk areas and enable effective planning and management. This study develops a predictive model to identify flood-prone areas in the Mae Chan Basin of Thailand using machine learning techniques, specifically the random subspace ensemble method combined with a deep neural network (RS-DNN) and the Nadam optimizer. The model was trained using 11 geographic information system (GIS) layers, including rainfall, elevation, slope, distance from the river, soil group, NDVI, road density, curvature, land use, flow accumulation, geology, and flood inventory data. Feature selection was carried out using the Gain Ratio method. The model was validated using accuracy, precision, ROC, and AUC metrics. Using the Wilcoxon signed-rank test, its effectiveness was compared to that of other machine learning algorithms, including random tree and support vector machines. The results showed that the RS-DNN model achieved a higher classification accuracy of 97% in both the training and testing datasets, compared to random tree (93%) and SVM (82%). The model's performance was also validated by its high AUC value of 0.99, compared to random tree (0.93) and SVM (0.82), at a significance level of 0.05. In conclusion, the RS-DNN model is a highly accurate tool for identifying flood-prone areas, aiding in effective flood management and planning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. A Comparative Study of Genetic Algorithm-Based Ensemble Models and Knowledge-Based Models for Wildfire Susceptibility Mapping.
- Author
-
Al-Shabeeb, Abdel Rahman, Hamdan, Ibraheem, Meimandi Parizi, Sedigheh, Al-Fugara, A'kif, Odat, Sana'a, Elkhrachy, Ismail, Hu, Tongxin, and Sammen, Saad Sh.
- Abstract
Wildfire susceptibility mapping (WSM) plays a crucial role in identifying areas with heightened vulnerability to forest fires, allowing for proactive measures in fire prevention, management, and resource allocation, ultimately leading to more effective fire control and mitigation strategies. This paper describes our undertaking to develop and compare the performance of two knowledge-based models, namely the analytic hierarchy process (AHP) and the technique for order performance by similarity to ideal solution (TOPSIS), as well as two novel genetic algorithm (GA)-based ensemble data-driven models: boosting and random subspace. The objective was to map susceptibility to forest fires in the Northern Mazar District in Jordan. The ensemble models were constructed using four well-known classifiers: decision tree (DT), support vector machine (SVM), k-nearest neighbors (kNN), and naive Bayes (NB) algorithms. This study utilized seventy forest fire locations and twelve influential factors to build and evaluate the models. To identify the optimal features for constructing the data-driven models, a GA-based wrapper method and four machine learning models were applied. During the validation phase, the area under the receiver operating characteristic curve (AUROCC) values for the single SVM, single NB, single DT, single kNN, GA-based boosting, GA-based random subspace, FR-AHP, and AHP-TOPSIS models were found to be 85.3%, 85.9%, 73.8%, 88.7%, 95.0%, 95.0%, 74.0%, and 65.4% respectively. The results indicated that the GA-based ensemble models outperformed both the single machine learning models and the knowledge-based techniques in terms of performance. The developed models in this study can be effectively utilized in various management and decision-making processes aimed at mitigating forest fire risks and enhancing fire control strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2023
14. Modeling Method for Whole Life Trajectory of Protection Equipment Based on Information Label
- Author
-
Zhang, Lie, Peng, Guo, Yang, Guosheng, Li, Zhongqing, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Li, Jian, editor, Xie, Kaigui, editor, Hu, Jianlin, editor, and Yang, Qingxin, editor
- Published
- 2023
15. Landslide Susceptibility Assessment and Management Using Advanced Hybrid Machine Learning Algorithms in Darjeeling Himalaya, India
- Author
-
Saha, Anik, Saha, Sunil, Mandal, Sujit, editor, Maiti, Ramkrishna, editor, Nones, Michael, editor, and Beckedahl, Heinz R., editor
- Published
- 2022
16. Measuring Landslide Susceptibility of Phuentsholling, Bhutan Using Novel Ensemble Machine Learning Methods
- Author
-
Sarkar, Raju, Saha, Sunil, Roy, Jagabandhu, Bhardwaj, Dhruv, Shaw, Rajib, Series Editor, Sarkar, Raju, editor, and Pradhan, Biswajeet, editor
- Published
- 2022
17. Forecasting Long-Series Daily Reference Evapotranspiration Based on Best Subset Regression and Machine Learning in Egypt.
- Author
-
Elbeltagi, Ahmed, Srivastava, Aman, Al-Saeedi, Abdullah Hassan, Raza, Ali, Abd-Elaty, Ismail, and El-Rawy, Mustafa
- Subjects
EVAPOTRANSPIRATION, WATER management, AGRICULTURAL water supply, MACHINE learning, TREE pruning, HYDROLOGIC cycle
- Abstract
The estimation of reference evapotranspiration (ETo), a crucial step in the hydrologic cycle, is essential for system design and management, including the balancing, planning, and scheduling of agricultural water supply and water resources. When climates vary from arid to semi-arid, and there are problems with a lack of meteorological data and a lack of future information on ETo, as is the case in Egypt, it is more important to estimate ETo precisely. To address this, the current study aimed to model ETo for Egypt's most important agricultural governorates (Al Buhayrah, Alexandria, Ismailiyah, and Minufiyah) using four machine learning (ML) algorithms: linear regression (LR), random subspace (RSS), additive regression (AR), and reduced error pruning tree (REPTree). The Climate Forecast System Reanalysis (CFSR) of the National Centers for Environmental Prediction (NCEP) was used to gather daily climate data variables from 1979 to 2014. The datasets were split into two sections: the training phase, i.e., 1979–2006, and the testing phase, i.e., 2007–2014. Maximum temperature (Tmax), minimum temperature (Tmin), and solar radiation (SR) were found to be the three input variables that had the most influence on the outcome of subset regression and sensitivity analysis. A comparative analysis of ML models revealed that REPTree outperformed competitors by achieving the best values for various performance metrics during the training and testing phases. The study's novelty lies in the use of REPTree to estimate and predict ETo, as this algorithm has not been commonly used for this purpose. Given the sparse attempts to use this model for such research, the remarkable accuracy of the REPTree model in predicting ETo highlighted the rarity of this study. In order to combat the effects of aridity through better water resource management, the study also cautions Egypt's authorities to concentrate their policymaking on climate adaptation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
18. Performance of Machine Learning Techniques for Meteorological Drought Forecasting in the Wadi Mina Basin, Algeria.
- Author
-
Achite, Mohammed, Elshaboury, Nehal, Jehanzaib, Muhammad, Vishwakarma, Dinesh Kumar, Pham, Quoc Bao, Anh, Duong Tran, Abdelkader, Eslam Mohammed, and Elbeltagi, Ahmed
- Subjects
WATER management, DESERTIFICATION, DROUGHT forecasting, MACHINE performance, MACHINE learning, SUPPORT vector machines, SOIL degradation
- Abstract
Water resources, land and soil degradation, desertification, agricultural productivity, and food security are all adversely influenced by drought. The prediction of meteorological droughts using the standardized precipitation index (SPI) is crucial for water resource management. The modeling results for SPI at 3, 6, 9, and 12 months are based on five types of machine learning: support vector machine (SVM), additive regression, bagging, random subspace, and random forest. After training, testing, and cross-validation at five folds on sub-basin 1, the results concluded that SVM is the most effective model for predicting SPI for different months (3, 6, 9, and 12). Then, SVM, as the best model, was applied on sub-basin 2 for predicting SPI at different timescales and it achieved satisfactory outcomes. Its performance was validated on sub-basin 2 and satisfactory results were achieved. The suggested model performed better than the other models for estimating drought at sub-basins during the testing phase. The suggested model could be used to predict meteorological drought on several timescales, choose remedial measures for research basin, and assist in the management of sustainable water resources. [ABSTRACT FROM AUTHOR]
- Published
- 2023
19. Credal-Decision-Tree-Based Ensembles for Spatial Prediction of Landslides.
- Author
-
Gui, Jingyun, Pérez-Rey, Ignacio, Yao, Miao, Zhao, Fasuo, and Chen, Wei
- Subjects
LANDSLIDES, LANDSLIDE prediction, LANDSLIDE hazard analysis, RECEIVER operating characteristic curves, DECISION trees
- Abstract
Spatial landslide susceptibility assessment is a fundamental part of landslide risk management and land-use planning. The main objective of this study is to apply the Credal Decision Tree (CDT), adaptive boosting Credal Decision Tree (AdaCDT), and random subspace Credal Decision Tree (RSCDT) models to construct landslide susceptibility maps in Zhashui County, China. The observed 169 historical landslides were classified into two groups: 70% (118 landslides) for training and 30% (51 landslides) for validation. To compare and validate the performance of the three models, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were utilized. Specifically, the success rates of the CDT model, AdaCDT model, and RSCDT model were 0.788, 0.821, and 0.847, respectively, while the corresponding prediction rates were 0.771, 0.802, and 0.861, respectively. In sum, the two ensemble models can effectively improve the performance accuracy of an individual CDT model, and the RSCDT model was proven to be superior to the other two models. Therefore, ensemble models are capable of being novel and promising approaches for the spatial prediction and zonation of a certain region's landslide susceptibility. [ABSTRACT FROM AUTHOR]
- Published
- 2023
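The AUC values compared in the abstract above can be read as the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. The Python sketch below computes the AUC in that pairwise form; the scores are hypothetical and the function name is an assumption made for this example.

```python
def auc(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical susceptibility scores, for illustration only.
landslide = [0.9, 0.8, 0.6, 0.4]
stable = [0.7, 0.3, 0.2, 0.1]
print(auc(landslide, stable))
```

Computed on the training landslides this quantity corresponds to what the abstract calls the success rate, and computed on the held-out validation landslides it corresponds to the prediction rate.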
20. Enhancing trustworthiness among iot network nodes with ensemble deep learning-based cyber attack detection.
- Author
-
Malathi, Dr S and Begum, S. Razool
- Subjects
*COMPUTER network traffic, *CONVOLUTIONAL neural networks, *TELECOMMUNICATION, *CYBER physical systems, *EXPERT systems
- Abstract
A lot of machine learning methods and expert systems are used in network intrusion detection automation. When different industrial control systems merge with the Internet of Things (IoT) environment, they become vulnerable to cyber-attacks in critical infrastructure situations requiring communication technologies. Conventional machine learning techniques used for network anomaly detection are ineffective due to the substantial amount of network traffic within important Cyber-Physical Systems (CPSs). In this manuscript, Cyberattack Identification Through Ensemble Deep Learning in an IoT environment is proposed. Initially, the input network traffic data are taken from the IoT-23 dataset. Then the network traffic data are preprocessed using Z-Score normalization to reduce any irrelevant or erroneous data from the input dataset. Then the relevant features are selected using the Gorilla Troops Optimization (GTO) algorithm. Afterwards, the selected features are fed into the ensemble classification model based on Random Space (RS), Random Tree (RT), Extreme Gradient Boosting (XGBoost), and Graph Convolutional Neural Network (GCNN). Among the several ensembling techniques, GCNN can achieve better performance. Python is used to accomplish the suggested technique. The performance of the proposed GCNN method provides 12.09%, 4.34%, and 3.21% higher accuracy than the other models like RS, RT and XGBoost respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Groundwater potentiality mapping using ensemble machine learning algorithms for sustainable groundwater management
- Author
-
Showmitra Kumar Sarkar, Swapan Talukdar, Atiqur Rahman, Shahfahad, and Sujit Kumar Roy
- Subjects
Groundwater potentiality, Data mining, GIS, Remote sensing, Random subspace, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
Purpose – The present study aims to construct ensemble machine learning (EML) algorithms for groundwater potentiality mapping (GPM) in the Teesta River basin of Bangladesh, including random forest (RF) and random subspace (RSS). Design/methodology/approach – The RF and RSS models have been implemented for integrating 14 selected groundwater condition parameters with groundwater inventories for generating GPMs. The GPMs were then validated using the empirical and binormal receiver operating characteristic (ROC) curves. Findings – The very high (831–1200 km²) and high (521–680 km²) groundwater potential areas were predicted using EML algorithms. The RSS model (AUC = 0.892) outperformed the RF model based on the area under the ROC curve (AUC). Originality/value – Two new EML models have been constructed for GPM. These findings will aid in proposing sustainable water resource management plans.
- Published
- 2022
22. Drought indicator analysis and forecasting using data driven models: case study in Jaisalmer, India.
- Author
-
Elbeltagi, Ahmed, Kumar, Manish, Kushwaha, N. L., Pande, Chaitanya B., Ditthakit, Pakorn, Vishwakarma, Dinesh Kumar, and Subeesh, A.
- Subjects
*DROUGHT management, *DROUGHT forecasting, *DROUGHTS, *WEATHER forecasting, *IRRIGATION scheduling, *DATA modeling, *TREE pruning, *RANDOM forest algorithms
- Abstract
Agricultural droughts are a prime concern for economies worldwide as they negatively impact the productivity of rain-fed crops, employment, and income per capita. In this study, the Standardized Precipitation Index (SPI) has been used to evaluate different drought indices for Rajasthan, India. In agricultural, hydrological, and meteorological applications such as irrigation scheduling, crop simulation, water budgeting, reservoir operations, and weather forecasting, the accurate estimation of drought indices such as the SPI plays an important role. Thus, the present study was conducted to examine the feasibility and effectiveness of the Random Subspace (RSS) model and its hybridization with the M5 Pruning tree (M5P), Random Forest (RF), and Random Tree (RT) to estimate the SPI at 3-, 6-, and 12-month timescales during 2000–2019. Performances of RSS and the hybridized algorithms were assessed and compared using performance indicators (i.e., MAE, RMSE, RAE, RRSE, and R2) and various graphical interpretations. Results indicated that RSS-M5P provided the most accurate SPI prediction (MAE = 0.497, RMSE = 0.682, RAE = 81.88, RRSE = 87.22, and R2 = 0.507 for SPI-3; MAE = 0.452, RMSE = 0.717, RAE = 69.76, RRSE = 85.24, and R2 = 0.402 for SPI-6; and MAE = 0.294, RMSE = 0.377, RAE = 55.79, RRSE = 59.57, and R2 = 0.783 for SPI-12) compared to the RSS alone, RSS-RF, and RSS-RT models for studying the drought situation in Jaisalmer, Rajasthan. The M5P algorithm improved the performance of the RSS structure. [ABSTRACT FROM AUTHOR]
- Published
- 2023
23. Evaluation of Data-driven Hybrid Machine Learning Algorithms for Modelling Daily Reference Evapotranspiration.
- Author
-
Kushwaha, Nand Lal, Rajput, Jitendra, Sena, D.R., Elbeltagi, Ahmed, Singh, D.K., and Mani, Indra
- Abstract
Copyright of Atmosphere -- Ocean (Taylor & Francis Ltd) is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2022
24. Breast cancer prediction from microRNA profiling using random subspace ensemble of LDA classifiers via Bayesian optimization.
- Author
-
Sharma, Sudhir Kumar, Vijayakumar, K., Kadam, Vinod J., and Williamson, Sheldon
- Subjects
FISHER discriminant analysis, BREAST cancer, NON-coding RNA, MICRORNA, TUMOR classification, CANCER diagnosis
- Abstract
Breast cancer rates are rising. Breast cancer also remains the second leading cause of cancer-related mortality in females, and its mortality rate is rising sharply. In recent years, microRNAs (miRNAs) have emerged as promising biomarkers because of their roles in the diagnosis of human diseases, including breast cancer. miRNAs are small, regulatory, evolutionarily conserved non-coding RNA (ncRNA) molecules, about 22 nucleotides long, that are present in all eukaryotic cells. Many studies in the literature focus on circulating miRNAs, their relationships to human diseases, and their role as potential biomarkers. In this study we used three key techniques to classify breast cancer from miRNA features: Linear Discriminant Analysis (LDA), Random Subspace Ensemble (RSE), and Bayesian Hyperparameter Optimization (BHO). LDA is a simple yet practical and computationally attractive classification approach. RSE is capable of producing a robust ensemble for classification. Previous research has applied Bayesian optimization to many engineering optimization problems, and it has recently been applied to hyperparameter tuning in various ensemble classifiers. We therefore studied the potential of an RSE of LDA base classifiers, tuned with BHO, to boost the predictive accuracy of breast cancer diagnosis from an miRNA profiling dataset. A publicly available dataset of serum miRNA profiles from GEO (accession code GSE106817) was used for validation. A variety of performance measures were used to assess the proposed model and the other classifiers. The proposed approach performed well overall. The results were compared directly with the individual LDA classifier and other established state-of-the-art classifiers. The outcomes indicate that the approach is superior, across the performance indicators considered, to LDA and all the established state-of-the-art models used in the study. The simulations, outcomes, and mathematical investigations illustrate that the technique is a practical and advantageous model for classifying breast cancer from miRNA profiling. The model may also be useful for classifying other cancers from miRNA profiling. [ABSTRACT FROM AUTHOR]
- Published
- 2022
25. Landslide Susceptibility Modeling Using Remote Sensing Data and Random SubSpace-Based Functional Tree Classifier.
- Author
-
Peng, Tao, Chen, Yunzhi, and Chen, Wei
- Subjects
- *
LANDSLIDES , *LANDSLIDE hazard analysis , *NORMALIZED difference vegetation index , *REMOTE sensing , *RECEIVER operating characteristic curves , *REGRESSION trees - Abstract
In this study, a random subspace-based functional tree (RSFT) was developed for landslide susceptibility modeling and compared with a bagging-based functional tree (BFT), a classification and regression tree (CART), and a Naïve-Bayes tree (NBTree) classifier to judge the performance difference between the hybrid model and the single models. In the first step, according to the characteristics of the geological environment and previous literature, 12 landslide conditioning factors were selected: aspect, slope, profile curvature, plan curvature, elevation, topographic wetness index (TWI), lithology, normalized difference vegetation index (NDVI), land use, soil, distance to rivers, and distance to roads. Secondly, 328 historical landslides were randomly divided into a training group and a validation group in a ratio of 70/30, and an importance analysis of landslide points and conditioning factors was carried out using the functional tree (FT) model. In the third step, all data were loaded into the FT, RSFT, BFT, CART, and NBTree models to generate landslide susceptibility maps (LSMs). The models were compared by the area under the receiver operating characteristic curve (AUC) to determine efficiency and effectiveness. According to the verification results, all five models perform reasonably, but the RSFT model has the highest prediction rate (AUC = 0.838), which is better than that of the three single machine learning models. The results also demonstrate that the hybrid model generally improves the predictive power of the benchmark landslide susceptibility models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
26. An automated clinical decision support system for predicting cardiovascular disease using ensemble learning approach.
- Subjects
CLINICAL decision support systems ,CARDIOVASCULAR diseases ,HEART disease diagnosis ,OUTLIER detection ,DISEASE risk factors ,K-nearest neighbor classification - Abstract
With the vast advancements in the medical domain, earlier prediction of disease plays a substantial role in enhancing healthcare quality and assists in making better decisions during emergencies. Most existing research concentrates on building an automated prediction model for heart disease and its risk factors. Nevertheless, accurate classification is a vital challenge in heart disease diagnosis, where managing high‐dimensional data increases the execution time of existing classifiers. In this paper, a new ensemble model based on a random subspace and K‐nearest neighbor (RSS‐KNN) scheme is proposed for earlier prediction of heart disease. First, the proposed scheme implements an isolation‐based outlier removal mechanism to remove noise and outliers in the distributed data. Subsequently, the essential features are identified using RSS by varying the testing and training errors in the evaluation phase. The extracted features are then fed into KNN for the accurate classification of heart disease. Finally, an enhanced squirrel optimizer is employed in the proposed scheme to obtain global results that balance exploration and exploitation and eliminate over‐fitting problems. The simulation results show that the accuracy (without features) of the proposed ensemble RSS‐KNN scheme on the UCI ML dataset is 97.65%, accuracy (with features) is 98.56%, and specificity is 98.10% when compared with existing state‐of‐the‐art classifiers. [ABSTRACT FROM AUTHOR]
- Published
- 2022
27. Ensemble machine learning models based on Reduced Error Pruning Tree for prediction of rainfall-induced landslides
- Author
-
Binh Thai Pham, Abolfazl Jaafari, Trung Nguyen-Thoi, Tran Van Phong, Huu Duy Nguyen, Neelima Satyam, Md Masroor, Sufia Rehman, Haroon Sajjad, Mehebub Sahana, Hiep Van Le, and Indra Prakash
- Subjects
machine learning ,ensemble modeling ,bagging ,decorate ,random subspace ,Mathematical geography. Cartography ,GA1-1776 - Abstract
In this paper, we developed highly accurate ensemble machine learning models integrating Reduced Error Pruning Tree (REPT) as a base classifier with the Bagging (B), Decorate (D), and Random Subspace (RSS) ensemble learning techniques for spatial prediction of rainfall-induced landslides in the Uttarkashi district, located in the Himalayan range, India. To do so, a total of 103 historical landslide events were linked to twelve conditioning factors for generating training and validation datasets. Root Mean Square Error (RMSE) and Area Under the receiver operating characteristic Curve (AUC) were used to evaluate the training and validation performances of the models. The results showed that the single REPT model and its derived ensembles provided a satisfactory accuracy for the prediction of landslides. The D-REPT model with RMSE = 0.351 and AUC = 0.907 was identified as the most accurate model, followed by RSS-REPT (RMSE = 0.353 and AUC = 0.898), B-REPT (RMSE = 0.396 and AUC = 0.876), and the single REPT model (RMSE = 0.398 and AUC = 0.836), respectively. The prominent ensemble models proposed and verified in this study provide engineers and modelers with insights for development of more advanced predictive models for different landslide-susceptible areas around the world.
- Published
- 2021
28. Random Subspace Combined LDA Based Machine Learning Model for OSCC Classifier
- Author
-
Nawandhar, Archana, Kumar, Navin, Yamujala, Lakshmi, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Saha, Snehanshu, editor, Nagaraj, Nithin, editor, and Tripathi, Shikha, editor
- Published
- 2020
29. Application of an ensemble learning model based on random subspace and a J48 decision tree for landslide susceptibility mapping: a case study for Qingchuan, Sichuan, China.
- Author
-
Li, Yangchun, Lin, Feikai, Luo, Xiangang, Zhu, Shuang, Li, Jiang, Xu, Zhanya, Liu, Xiuwei, Luo, Shungen, Huo, Guangjie, Peng, Liangsheng, and Feng, Haiping
- Subjects
LANDSLIDE hazard analysis ,LANDSLIDES ,DECISION trees ,ARTIFICIAL neural networks - Abstract
Landslides are a serious natural hazard worldwide. A map of landslide susceptibility can help to effectively reduce losses. In this paper, a hybrid ensemble technique based on random subspace (RS) and a J48 decision tree, named RS–J48T, was proposed for landslide susceptibility mapping. This model could enhance the effect of a single classifier significantly and solve the problem of overfitting. Qingchuan County, Sichuan (China) was taken as the study area. A geospatial database consisting of 640 landslide locations and 12 factors was constructed for this study. The J48 decision tree, an artificial neural network (ANN), and other ensemble techniques such as AdaBoost and Bagging were selected for comparison. Receiver operating characteristic curves and some statistical indices were used for model validation. The results showed that the RS–J48T model had better fitting capability (AUC = 0.875) and the best prediction capability (AUC = 0.769) compared to the other models. Overall, the novel hybrid model could be a promising way to generate landslide susceptibility maps for other landslide-prone areas. [ABSTRACT FROM AUTHOR]
- Published
- 2022
30. An Efficient Classification of MRI Brain Images
- Author
-
Muhammad Assam, Hira Kanwal, Umar Farooq, Said Khalid Shah, Arif Mehmood, and Gyu Sang Choi
- Subjects
Color moments (CMs) ,feed forward artificial neural network (FF-ANN) ,random subspace ,random forest ,baysnet ,principal component analysis (PCA) ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
The unprecedented improvements in computing capabilities and the introduction of advanced techniques for the analysis, interpretation, processing, and visualization of images have greatly diversified the domain of medical sciences and resulted in the field of medical imaging. Magnetic Resonance Imaging (MRI), an advanced imaging technique, is capable of producing high-quality images of the human body, including the brain, for diagnostic purposes. This paper proposes a simple but efficient solution for classifying MRI brain images into normal images and abnormal images containing disorders and injuries. For evaluation, it uses images with brain tumor, acute stroke, and Alzheimer's disease, besides normal images, from the public dataset developed by Harvard Medical School. The proposed model is a four-step process: 1) Pre-processing, 2) Feature Extraction, 3) Feature Reduction, and 4) Classification. In the pre-processing step, a median filter, being one of the best algorithms, is used to remove noise such as salt-and-pepper noise and unwanted components such as scalp and skull. During this stage, the images are converted from grayscale to color images for further processing. In the second step, the Discrete Wavelet Transform (DWT) is used to extract different features from the images. In the third step, Color Moments (CMs) are used to reduce the number of features and obtain an optimal set of characteristics. Images with the optimal set of features are passed to different classifiers for classification. The Feed Forward ANN (FF-ANN), an individual classifier trained and tested with a 65%/35% split, and the hybrid classifiers Random Subspace with Random Forest (RSwithRF) and Random Subspace with Bayesian Network (RSwithBN), both evaluated with 10-fold cross-validation, achieved classification accuracies of 95.83%, 97.14%, and 95.71%, respectively. These promising results show that the proposed method is robust and efficient in comparison with existing classification methods in terms of accuracy with a smaller number of optimal features.
- Published
- 2021
31. Rotation forest of random subspace models.
- Author
-
Alexandropoulos, Stamatios-Aggelos N., Aridas, Christos K., Kotsiantis, Sotiris B., Gravvanis, George A., and Vrahatis, Michael N.
- Subjects
ROTATIONAL motion ,DATA mining - Abstract
During the last decade, a variety of ensemble methods have been developed. All known and widely used methods of this category produce and combine different learners utilizing the same algorithm as the basic classifiers. In the present study, we use two well-known approaches, namely Rotation Forest and Random Subspace, to increase the effectiveness of a single learning algorithm. To test the proposed model, we conducted experiments against other well-known ensemble methods, each with 25 sub-classifiers. The experimental study is based on 35 different datasets. According to the Friedman test, the Rotation Forest of Random Subspace C4.5 (RFRS C4.5) and PART (RFRS PART) algorithms exhibit the best scores in our resulting ranking. Our results show that the proposed method exhibits competitive performance and better accuracy in most cases. [ABSTRACT FROM AUTHOR]
- Published
- 2022
32. Multi-instance ensemble algorithm combined with fuzzy clustering.
- Author
-
韩海韵, 杨有龙, and 孙丽芹
- Subjects
FUZZY algorithms ,PHARMACODYNAMICS ,PROBLEM solving ,AMBIGUITY ,FUZZY clustering technique ,ALGORITHMS - Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2022
33. Genetically Optimized Ensemble Classifiers for Multiclass Student Performance Prediction.
- Author
-
Begum, Safira and Padmannavar, Sunita S.
- Subjects
DATA mining ,SCHOOL dropouts ,MACHINE learning ,GENETIC algorithms ,GENETIC models - Abstract
The knowledge obtained from data can be useful for the improvement of education systems, giving rise to a research area called Educational Data Mining (EDM). EDM covers the development of methods to explore information collected from educational environments, making it possible to understand students more effectively and to provide them with better educational benefits. Machine learning (ML) technologies have grown considerably in recent years. The field of data mining in education provides researchers and educators with metrics of success, failure, dropout, and more, allowing student outcomes to be predicted. The main reason for dropping out of school is not studying. Several researchers have proposed various educational data mining techniques to predict student performance and have analyzed the techniques applied to educational datasets. This paper proposes a student performance prediction model that uses ensemble classifiers. Initially the data are pre-processed, and the correlation between the entrance attributes is analyzed to identify possible redundancies between them, resulting from a very high positive correlation. The filtered attributes are used to train and test Boosting, Bagging, and Random Subspace classifiers. To further improve the accuracy of the predictive model, a genetic algorithm is applied to the three classifiers. A genetic algorithm is an approach used to find optimized solutions to search problems, and it is intended to increase the probability of solving the problem. The optimization process selects the best option from the available set of options to achieve the desired goal, such that efficiency is maximized and error is minimized. Classifier accuracy improves significantly when tested on the mathematics and Portuguese datasets, by 3% and 11%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
34. Fuzzy Clustering with Ensemble Classification Techniques to Improve the Customer Churn Prediction in Telecommunication Sector
- Author
-
Vijaya, J., Sivasankar, E., Gayathri, S., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Kalita, Jugal, editor, Balas, Valentina Emilia, editor, Borah, Samarjeet, editor, and Pradhan, Ratika, editor
- Published
- 2019
35. Machine Learning Methods for Fake News Classification
- Author
-
Ksieniewicz, Paweł, Choraś, Michał, Kozik, Rafał, Woźniak, Michał, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Yin, Hujun, editor, Camacho, David, editor, Tino, Peter, editor, Tallón-Ballesteros, Antonio J., editor, Menezes, Ronaldo, editor, and Allmendinger, Richard, editor
- Published
- 2019
36. Combining Random Subspace Approach with smote Oversampling for Imbalanced Data Classification
- Author
-
Ksieniewicz, Pawel, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pérez García, Hilde, editor, Sánchez González, Lidia, editor, Castejón Limas, Manuel, editor, Quintián Pardo, Héctor, editor, and Corchado Rodríguez, Emilio, editor
- Published
- 2019
37. Digital Watermark Extraction Using RS-KNN and RS-LDA with LWT and Statistical Features
- Author
-
Jaiswal, Sushma and Pandey, Manoj Kumar
- Published
- 2023
38. Modeling groundwater potential using novel GIS-based machine-learning ensemble techniques
- Author
-
Alireza Arabameri, Subodh Chandra Pal, Fatemeh Rezaie, Omid Asadi Nalivan, Indrajit Chowdhuri, Asish Saha, Saro Lee, and Hossein Moayedi
- Subjects
Groundwater ,Machine learning ,Random subspace ,Ensemble models ,RS-GIS ,Iran ,Physical geography ,GB3-5030 ,Geology ,QE1-996.5 - Abstract
Study region: The present study was carried out in the Tabriz River basin (5397 km²) in north-western Iran. Elevations vary from 1274 to 3678 m above sea level, and slope angles range from 0 to 150.9 %. The average annual minimum and maximum temperatures are 2 °C and 12 °C, respectively. The average annual rainfall ranges from 243 to 641 mm, and the northern and southern parts of the basin receive the highest amounts. Study focus: In this study, we mapped the groundwater potential (GWP) with a new hybrid model combining random subspace (RS) with the multilayer perceptron (MLP), naïve Bayes tree (NBTree), and classification and regression tree (CART) algorithms. A total of 205 spring locations were collected by integrating field surveys with data from Iran Water Resources Management, and divided 70:30 for training and validation. Fourteen groundwater conditioning factors (GWCFs) were used as independent model inputs. Statistics such as the receiver operating characteristic (ROC) curve and five others were used to evaluate the performance of the models. New hydrological insights for the region: The results show that all models performed well for GWP mapping (AUC > 0.8). The hybrid MLP-RS model achieved high validation scores (AUC = 0.935). The relative-importance analysis of the GWCFs revealed that slope, elevation, TRI and HAND are the most important predictors of groundwater presence. This study demonstrates that hybrid ensemble models can support sustainable management of groundwater resources.
- Published
- 2021
39. Landslide susceptibility mapping using an ensemble model of Bagging scheme and random subspace–based naïve Bayes tree in Zigui County of the Three Gorges Reservoir Area, China.
- Author
-
Hu, Xudong, Huang, Cheng, Mei, Hongbo, and Zhang, Han
- Subjects
- *
LANDSLIDES , *LANDSLIDE hazard analysis , *LANDSLIDE prediction , *GORGES , *SUPPORT vector machines , *PEARSON correlation (Statistics) , *RANDOM forest algorithms - Abstract
A novel machine learning ensemble model that hybridizes Bagging and the random subspace–based naïve Bayes tree (RSNBtree), named BRSNBtree, was used to prepare a landslide susceptibility map for Zigui County of the Three Gorges Reservoir Area, China. The proposed method uses the Bagging scheme to integrate the base-level RSNBtree model. To predict landslide susceptibility for the study area, a spatial database consisting of 807 landslides and 11 conditioning factors was prepared. The conditioning factors were evaluated using the Pearson correlation coefficient and the Relief-F method. The results indicate that all factors except the topographic wetness index can be accepted as modeling inputs. In particular, the distance to rivers is the most important factor in landslide susceptibility prediction. The performance of the landslide models was evaluated using statistical indices and the area under the receiver operating characteristic curve (AUC). Support vector machines (SVM) and random forest (RF) were adopted for comparison with our methods. Results show that BRSNBtree (AUC = 0.968) achieves the highest prediction performance, which successfully refines RSNBtree (AUC = 0.938) and outperforms RF (AUC = 0.949) and SVM (AUC = 0.895). Therefore, the proposed BRSNBtree presents advantages in targeting landslide-susceptible areas and provides a promising method for landslide susceptibility assessment. The developed susceptibility maps could facilitate effective landslide risk management for this landslide-prone area. [ABSTRACT FROM AUTHOR]
- Published
- 2021
40. Ensemble machine learning models based on Reduced Error Pruning Tree for prediction of rainfall-induced landslides.
- Author
-
Pham, Binh Thai, Jaafari, Abolfazl, Nguyen-Thoi, Trung, Van Phong, Tran, Nguyen, Huu Duy, Satyam, Neelima, Masroor, Md, Rehman, Sufia, Sajjad, Haroon, Sahana, Mehebub, Van Le, Hiep, and Prakash, Indra
- Subjects
LANDSLIDE prediction ,TREE pruning ,MACHINE learning ,LANDSLIDE hazard analysis ,RECEIVER operating characteristic curves ,LANDSLIDES ,STANDARD deviations - Abstract
In this paper, we developed highly accurate ensemble machine learning models integrating Reduced Error Pruning Tree (REPT) as a base classifier with the Bagging (B), Decorate (D), and Random Subspace (RSS) ensemble learning techniques for spatial prediction of rainfall-induced landslides in the Uttarkashi district, located in the Himalayan range, India. To do so, a total of 103 historical landslide events were linked to twelve conditioning factors for generating training and validation datasets. Root Mean Square Error (RMSE) and Area Under the receiver operating characteristic Curve (AUC) were used to evaluate the training and validation performances of the models. The results showed that the single REPT model and its derived ensembles provided a satisfactory accuracy for the prediction of landslides. The D-REPT model with RMSE = 0.351 and AUC = 0.907 was identified as the most accurate model, followed by RSS-REPT (RMSE = 0.353 and AUC = 0.898), B-REPT (RMSE = 0.396 and AUC = 0.876), and the single REPT model (RMSE = 0.398 and AUC = 0.836), respectively. The prominent ensemble models proposed and verified in this study provide engineers and modelers with insights for development of more advanced predictive models for different landslide-susceptible areas around the world. [ABSTRACT FROM AUTHOR]
- Published
- 2021
41. Anti-cross validation technique for constructing and boosting random subspace neural network ensembles for hyperspectral image classification.
- Author
-
Eeti, Laxmi Narayana and Buddhiraju, Krishna Mohan
- Subjects
- *
FEATURE selection , *DATA mining , *CLASSIFICATION - Abstract
Achieving high classification accuracy is vital for reliable information extraction from images. Single classifiers and existing ensemble methods suffer from data dimensionality, insufficient ground truth information, and the lack of a well-defined optimal feature selection. This article presents a novel idea for constructing component classifiers that boosts the classification performance of the random subspace ensemble method. It is achieved through sub-optimal training of component classifiers, by interfering in the training process during validation error evaluation. The new approach makes it possible to enforce different class errors among component classifiers while also improving individual class accuracy. This article demonstrates the effectiveness of the anti-cross validation approach on three classical hyperspectral image (HSI) datasets, with classification accuracy improving by 3 to 10% with the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2021
42. Random Subspace Ensemble Learning for Functional Near-Infrared Spectroscopy Brain-Computer Interfaces
- Author
-
Jaeyoung Shin
- Subjects
brain-computer interface ,ensemble learning ,functional near-infrared spectroscopy ,linear discriminant analysis ,random subspace ,support vector machine ,Neurosciences. Biological psychiatry. Neuropsychiatry ,RC321-571 - Abstract
The feasibility of the random subspace ensemble learning method was explored to improve the performance of functional near-infrared spectroscopy-based brain-computer interfaces (fNIRS-BCIs). Feature vectors have been constructed using the temporal characteristics of concentration changes in fNIRS chromophores such as mean, slope, and variance to implement fNIRS-BCIs systems. The mean and slope, which are the most popular features in fNIRS-BCIs, were adopted. Linear support vector machine and linear discriminant analysis were employed, respectively, as a single strong learner and multiple weak learners. All features in every channel and available time window were employed to train the strong learner, and the feature subsets were selected at random to train multiple weak learners. It was determined that random subspace ensemble learning is beneficial to enhance the performance of fNIRS-BCIs.
- Published
- 2020
43. Maximum Likelihood Estimation Based on Random Subspace EDA: Application to Extrasolar Planet Detection
- Author
-
Liu, Bin, Chen, Ke-Jia, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Shi, Yuhui, editor, Tan, Kay Chen, editor, Zhang, Mengjie, editor, Tang, Ke, editor, Li, Xiaodong, editor, Zhang, Qingfu, editor, Tan, Ying, editor, Middendorf, Martin, editor, and Jin, Yaochu, editor
- Published
- 2017
44. A Simple Tool for Bounding the Deviation of Random Matrices on Geometric Sets
- Author
-
Liaw, Christopher, Mehrabian, Abbas, Plan, Yaniv, Vershynin, Roman, Morel, Jean-Michel, Editor-in-chief, Brion, Michel, Series editor, Teissier, Bernard, Editor-in-chief, De Lellis, Camillo, Series editor, Di Bernardo, Mario, Series editor, Figalli, Alessio, Series editor, Khoshnevisan, Davar, Series editor, Kontoyiannis, Ioannis, Series editor, Lugosi, Gábor, Series editor, Podolskij, Mark, Series editor, Serfaty, Sylvia, Series editor, Wienhard, Anna, Series editor, Klartag, Bo'az, editor, and Milman, Emanuel, editor
- Published
- 2017
45. Defect detection combining random subspace and cascaded residual networks.
- Author
-
金闳奇, 陈新度, and 吴磊
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2020
46. Monthly suspended sediment load prediction using artificial intelligence: testing of a new random subspace method.
- Author
-
Nhu, Viet-Ha, Khosravi, Khabat, Cooper, James R., Karimi, Mahshid, Kisi, Ozgur, Pham, Binh Thai, and Lyu, Zongjie
- Subjects
- *
SUSPENDED sediments , *INTELLIGENCE tests , *FORECASTING , *ARTIFICIAL intelligence , *RADIAL basis functions - Abstract
The predictive capability of a new artificial intelligence method, random subspace (RS), for the prediction of suspended sediment load in rivers was compared with commonly used methods: random forest (RF) and two support vector machine (SVM) models using a radial basis function kernel (SVM-RBF) and a normalized polynomial kernel (SVM-NPK). Using river discharge, rainfall and river stage data from the Haraz River, Iran, the results revealed: (a) the RS model provided a superior predictive accuracy (NSE = 0.83) to SVM-RBF (NSE = 0.80), SVM-NPK (NSE = 0.78) and RF (NSE = 0.68), corresponding to very good, good, satisfactory and unsatisfactory accuracies in load prediction; (b) the RBF kernel outperformed the NPK kernel; (c) the predictive capability was most sensitive to gamma and epsilon in SVM models, maximum depth of a tree and the number of features in RF models, classifier type, number of trees and subspace size in RS models; and (d) suspended sediment loads were most closely correlated with river discharge (PCC = 0.76). Overall, the results show that RS models have great potential in data poor watersheds, such as that studied here, to produce strong predictions of suspended load based on monthly records of river discharge, rainfall depth and river stage alone. [ABSTRACT FROM AUTHOR]
- Published
- 2020
47. Random Subspace Ensemble Learning for Functional Near-Infrared Spectroscopy Brain-Computer Interfaces.
- Author
-
Shin, Jaeyoung
- Subjects
BRAIN-computer interfaces ,FISHER discriminant analysis ,SUPPORT vector machines ,SPECTROMETRY - Abstract
The feasibility of the random subspace ensemble learning method was explored to improve the performance of functional near-infrared spectroscopy-based brain-computer interfaces (fNIRS-BCIs). Feature vectors have been constructed using the temporal characteristics of concentration changes in fNIRS chromophores such as mean, slope, and variance to implement fNIRS-BCIs systems. The mean and slope, which are the most popular features in fNIRS-BCIs, were adopted. Linear support vector machine and linear discriminant analysis were employed, respectively, as a single strong learner and multiple weak learners. All features in every channel and available time window were employed to train the strong learner, and the feature subsets were selected at random to train multiple weak learners. It was determined that random subspace ensemble learning is beneficial to enhance the performance of fNIRS-BCIs. [ABSTRACT FROM AUTHOR]
- Published
- 2020
48. A classification method based on improved Random Subspace.
- Author
-
杨颖, 王琚, and 王刚
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2020
49. A random subspace based conic functions ensemble classifier.
- Author
ÇİMEN, Emre
- Subjects
CLASSIFICATION algorithms, HIGH-dimensional model representation, ALGORITHMS, SUPPORT vector machines, LINEAR programming, CONIC sections
- Abstract
Classifiers overfit when the ratio of data dimensionality to the number of samples in a dataset is high. This problem makes a classification model unreliable. When overfitting occurs, one can achieve high accuracy in training; however, test accuracy is significantly lower than training accuracy. The random subspace method is a practical approach to overcoming the overfitting problem. In random subspace methods, the classification algorithm selects a random subset of the features and trains a classifier function on the selected features. The classification algorithm repeats the process multiple times, and eventually obtains an ensemble of classifier functions. Conic function based classifiers achieve high performance in the literature; however, these classifiers cannot overcome the overfitting problem when the ratio of data dimensionality to the number of samples is high. The proposed method fills this gap in the literature on conic function classifiers. In this study, we combine the random subspace method and a novel conic function based classifier algorithm. We present the computational results by comparing the new approach with a wide range of models in the literature. The proposed method achieves better results than the previous implementations of conic function based classifiers and can compete with other well-known methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
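The random subspace procedure described in the abstract above (draw a random feature subset, train a classifier on it, repeat, then combine the classifiers) can be sketched as follows. This is a minimal illustration only: the base learner is a simple nearest-centroid classifier standing in for the paper's conic function classifier, which the abstract does not specify, and all function names, parameters and data are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y, features):
    """Nearest-centroid base learner trained only on the given feature indices."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroids(centroids, row, features):
    """Return the label whose centroid is closest within the selected subspace."""
    def sq_dist(centroid):
        return sum((row[f] - centroid[j]) ** 2 for j, f in enumerate(features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def fit_random_subspace(X, y, n_learners=25, subspace_size=2, seed=0):
    """Train n_learners base classifiers, each on its own random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_learners):
        features = rng.sample(range(n_features), subspace_size)
        ensemble.append((features, fit_centroids(X, y, features)))
    return ensemble


def predict_random_subspace(ensemble, row):
    """Combine the base classifiers by majority vote."""
    votes = Counter(predict_centroids(c, row, f) for f, c in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (illustrative): two classes that are well separated in all four features.
X = [[0.0, 0.1, 0.2, 0.0], [0.2, 0.0, 0.1, 0.1],
     [10.0, 9.9, 10.1, 10.0], [9.8, 10.2, 10.0, 9.9]]
y = [0, 0, 1, 1]
ensemble = fit_random_subspace(X, y)
print(predict_random_subspace(ensemble, [0.1, 0.2, 0.0, 0.1]))
print(predict_random_subspace(ensemble, [9.9, 10.1, 10.0, 9.8]))
```

Because each learner sees only `subspace_size` features, no single learner can fit noise across the full feature set, which is the intuition behind using the method when dimensionality is high relative to the number of samples.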
50. Soft Computing Ensemble Models Based on Logistic Regression for Groundwater Potential Mapping.
- Author
Nguyen, Phong Tung, Ha, Duong Hai, Avand, Mohammadtaghi, Jaafari, Abolfazl, Nguyen, Huu Duy, Al-Ansari, Nadhir, Van Phong, Tran, Sharma, Rohit, Kumar, Raghvendra, Le, Hiep Van, Ho, Lanh Si, Prakash, Indra, and Pham, Binh Thai
- Subjects
SOFT computing, LOGISTIC regression analysis, STANDARD deviations, RECEIVER operating characteristic curves, GROUNDWATER management
- Abstract
Groundwater potential maps are one of the most important tools for the management of groundwater storage resources. In this study, we proposed four ensemble soft computing models based on logistic regression (LR) combined with the dagging (DLR), bagging (BLR), random subspace (RSSLR), and cascade generalization (CGLR) ensemble techniques for groundwater potential mapping in Dak Lak Province, Vietnam. A suite of well yield data and twelve geo-environmental factors (aspect, elevation, slope, curvature, Sediment Transport Index, Topographic Wetness Index, flow direction, rainfall, river density, soil, land use, and geology) were used for generating the training and validation datasets required for the building and validation of the models. Based on the area under the receiver operating characteristic curve (AUC) and several other validation methods (negative predictive value, positive predictive value, root mean square error, accuracy, sensitivity, specificity, and Kappa), it was revealed that all four ensemble learning techniques were successful in enhancing the validation performance of the base LR model. The ensemble DLR model (AUC = 0.77) was the most successful model in identifying the groundwater potential zones in the study area, followed by the RSSLR (AUC = 0.744), BLR (AUC = 0.735), CGLR (AUC = 0.715), and single LR model (AUC = 0.71), respectively. The models developed in this study and the resulting potential maps can assist decision-makers in the development of effective adaptive groundwater management plans. [ABSTRACT FROM AUTHOR]
- Published
- 2020