Combining CNN with DS3 for Detecting Bug-prone Modules in Cross-version Projects
- Source :
- SEAA
- Publication Year :
- 2021
- Publisher :
- Institute of Electrical and Electronics Engineers Inc., 2021.
Abstract
- The paper focuses on Cross-Version Defect Prediction (CVDP), where a classification model is trained on data from a prior version and then tested by predicting defects in the components of the last release. To mitigate the distribution differences that can degrade the performance of machine-learning-based models, we consider the Dissimilarity-based Sparse Subset Selection (DS3) technique for selecting meaningful representatives to include in the training set. Furthermore, we employ a Convolutional Neural Network (CNN) to generate structural and semantic features, which are merged with traditional software measures to obtain a more comprehensive set of predictors. To evaluate the usefulness of our proposal in the CVDP scenario, we perform an empirical study on a total of 20 cross-version pairs from 10 different software projects. To build the prediction models we consider Logistic Regression (LR) and Random Forest (RF), and we adopt three evaluation criteria (F-measure, G-mean, and Balance) to assess prediction accuracy. Our results show that using the CNN with both LR and RF models has a significant impact, with an improvement of ∼20% for each evaluation criterion. In contrast, DS3 does not significantly improve prediction accuracy.
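The abstract names the three evaluation criteria but does not define them. As a minimal sketch, the Python below computes them from binary predictions using the definitions commonly found in the defect-prediction literature. These definitions are an assumption and may differ from the ones the paper uses: F-measure as the harmonic mean of precision and recall, G-mean as the square root of recall times specificity, and Balance as one minus the normalized distance from the ideal point (pf = 0, pd = 1). The function names `confusion_counts` and `cvdp_scores` are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = defective, 0 = clean)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def cvdp_scores(y_true, y_pred):
    """Return F-measure, G-mean, and Balance under the assumed definitions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    # pd: probability of detection (recall); pf: probability of false alarm.
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # Harmonic mean of precision and recall.
    f_measure = (
        2 * precision * pd / (precision + pd) if (precision + pd) else 0.0
    )
    # Geometric mean of recall and specificity (1 - pf).
    g_mean = math.sqrt(pd * (1 - pf))
    # One minus the normalized Euclidean distance to the point (pf=0, pd=1).
    balance = 1 - math.sqrt(pf ** 2 + (1 - pd) ** 2) / math.sqrt(2)

    return {"f_measure": f_measure, "g_mean": g_mean, "balance": balance}


if __name__ == "__main__":
    # Toy labels for modules of a later release (illustrative data only).
    actual = [1, 1, 1, 1, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 1, 0, 0, 0]
    print(cvdp_scores(actual, predicted))
```

In a cross-version setting, `y_true` would hold the defect labels of the later release and `y_pred` the output of a model trained on the prior version. All three scores lie in [0, 1], with higher values indicating better predictions.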
- Subjects :
- Training set
Classification Models
CNN
Cross-Version Defect Prediction
Semantic and Structural Features
Computer science
business.industry
Logistic regression
Machine learning
computer.software_genre
Convolutional neural network
Random forest
Software
Empirical research
Artificial intelligence
business
computer
Predictive modelling
Selection (genetic algorithm)
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- SEAA
- Accession number :
- edsair.doi.dedup.....f7aacc10e580c2d3eb01f14f7ad77e08