Software Defect Prediction Using Deep Q-Learning Network-Based Feature Extraction

Authors :
Qinhe Zhang
Jiachen Zhang
Tie Feng
Jialang Xue
Xinxin Zhu
Ningyang Zhu
Zhiheng Li
Source :
IET Software, Vol 2024 (2024)
Publication Year :
2024
Publisher :
Hindawi-IET, 2024.

Abstract

Machine learning-based software defect prediction (SDP) approaches have been commonly proposed to help deliver high-quality software. Unfortunately, previous research conducted without effective feature reduction suffers from high-dimensional data, leading to unsatisfactory prediction performance. Moreover, without proper feature reduction, the interpretability and generalization ability of machine learning models in SDP may be compromised, hindering their practical utility in diverse software development environments. In this paper, an SDP approach using deep Q-learning network (DQN)-based feature extraction is proposed to eliminate irrelevant, redundant, and noisy features and improve classification performance. In the data preprocessing phase, the BalanceCascade undersampling method is applied to divide the original datasets. As the first step of feature extraction, the weight ranking of all the metric elements is calculated according to the expected cross-entropy. Then, the relation matrix is constructed by applying random matrix theory. After that, the reward principle for computing the Q value of Q-learning is defined based on the weight ranking, the relation matrix, and the number of errors. According to this principle, a convolutional neural network model is trained on the datasets until sequences of metric pairs have been generated for all datasets; these sequences act as the revised feature set. Various experiments have been conducted on 11 NASA and 11 PROMISE repository datasets. Sensitivity analysis experiments show that binary classification algorithms in SDP approaches that use the DQN-based feature extraction outperform those that do not. We also conducted experiments to compare our approach with four state-of-the-art approaches on common datasets, which show that our approach is superior to these methods in precision, F-measure, area under the receiver operating characteristic curve (AUC), and Matthews correlation coefficient (MCC) values.
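
The evaluation measures named in the abstract have standard textbook definitions. The following is a minimal sketch of those definitions only, not code from the paper; the function names `confusion_metrics` and `auc_pairwise` are hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for convenience.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Precision, recall, F-measure, and MCC from binary confusion counts.

    Assumption: any measure whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


def auc_pairwise(labels, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    labels: 1 for a defective module, 0 for a clean one.
    scores: predicted defect-proneness, higher means more likely defective.
    Ties count as half. This is O(n^2) and meant only for illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up counts and scores, purely to show the call shape.
    print(confusion_metrics(tp=40, fp=10, tn=45, fn=5))
    print(auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside F-measure on class-imbalanced defect datasets.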

Subjects

Subjects :
Computer software
QA76.75-76.765

Details

Language :
English
ISSN :
17518814
Volume :
2024
Database :
Directory of Open Access Journals
Journal :
IET Software
Publication Type :
Academic Journal
Accession number :
edsdoj.111c6d68377340e0a4b4eb152c8ae8e0
Document Type :
article
Full Text :
https://doi.org/10.1049/2024/3946655