1. Duplicate Bug Report Detection Incorporating the D_BBAS Method.
- Author
-
曾 方, 谢 琪, and 崔梦天
- Subjects
- *
ARTIFICIAL neural networks , *COMPUTER software development , *MAINTENANCE costs , *OFFICES , *FEATURE extraction , *INSTITUTIONAL repositories - Abstract
Developers in large software projects rely on bug reports to locate and fix defects. Because reporters often describe the same bug in different words, automatically detecting duplicate bug reports can reduce redundant development work and maintenance costs. Recent duplicate bug report detection methods tend to use deep neural networks and to combine structured and unstructured information into hybrid feature representations. To extract features from the unstructured information in bug reports more effectively, this paper proposes the D_BBAS (Doc2Vec and BERT BiLSTM-Attention Similarity) method, which trains a feature extraction model on a large-scale bug report repository to generate a set of bug summary text representations and a set of bug description text representations that reflect deep semantic information. These two sets of distributed representations are then used to compute the similarity of each bug report pair, yielding two new similarity features. The two new features are combined with traditional features derived from structured information to detect duplicate bug reports. The approach is evaluated on the bug report repositories of the well-known open-source projects Eclipse, NetBeans, and OpenOffice, which together contain more than 500,000 bug reports. The experimental results show that, compared with representative methods, the proposed method improves the F1 value by 1.7% on average, demonstrating its effectiveness. [ABSTRACT FROM AUTHOR]
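The feature-combination step described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure, which the abstract does not specify, and it takes the summary and description vectors as already computed by some embedding model. The names `cosine_similarity`, `pair_features`, and `structured_features` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(
    summary_a: Sequence[float],
    summary_b: Sequence[float],
    desc_a: Sequence[float],
    desc_b: Sequence[float],
    structured_features: Sequence[float],
) -> List[float]:
    """Build one feature vector for a pair of bug reports.

    The first two entries are the summary and description similarities
    (assumed here to be cosine similarities); the rest are whatever
    structured-information features the caller supplies.
    """
    return [
        cosine_similarity(summary_a, summary_b),
        cosine_similarity(desc_a, desc_b),
        *structured_features,
    ]
```

The resulting vector would then be passed to a classifier that decides whether the pair is a duplicate; the choice of classifier is outside the scope of this sketch.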
- Published
- 2022