A novel methodology to classify test cases using natural language processing and imbalanced learning.
- Author
- Tahvili, Sahar; Hatvani, Leo; Ramentol, Enislay; Pimentel, Rita; Afzal, Wasif; Herrera, Francisco
- Subjects
- *NATURAL language processing, *LEARNING, *SUPERVISED learning, *COMPUTER software testing, *VECTOR data, *TEST systems
- Abstract
- Detecting the dependency between integration test cases plays a vital role in the area of software test optimization. Classifying test cases into two main classes – dependent and independent – can be employed for several test optimization purposes such as parallel test execution, test automation, test case selection and prioritization, and test suite reduction. This task can be seen as an imbalanced classification problem due to the test cases' distribution. Often the number of dependent and independent test cases is uneven, which is related to the testing level, testing environment and complexity of the system under test. In this study, we propose a novel methodology that consists of two main steps. Firstly, by using natural language processing we analyze the test cases' specifications and turn them into numeric vectors. Secondly, by using the obtained data vectors, we classify each test case into a dependent or an independent class. We carry out a supervised learning approach using different methods for handling imbalanced datasets. The feasibility and possible generalization of the proposed methodology are evaluated in two industrial projects at Bombardier Transportation, Sweden, with promising results.
• In a manual testing procedure, all testing artifacts are written in natural text, so employing natural language processing techniques might provide highly useful information for test optimization purposes.
• The ratio of dependent and independent test cases might suffer from an imbalanced distribution due to the testing level and complexity of the system under test.
• Doc2Vec proves to be a good tool when transforming the manual test cases into feature vectors.
• IFROWANN, an imbalanced learning algorithm, performs well at separating dependent from independent test cases.
[ABSTRACT FROM AUTHOR]
- Published
- 2020
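The two steps named in the abstract (text to vector, then imbalance-aware classification) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: a plain term-count vector stands in for Doc2Vec, and a k-nearest-neighbour vote weighted by inverse class frequency stands in for IFROWANN. The function names, the toy specifications and their labels are all hypothetical.

```python
import math
from collections import Counter


def build_vocab(specs):
    """Map every word seen in the training specifications to a vector index."""
    words = sorted({w for s in specs for w in s.lower().split()})
    return {w: i for i, w in enumerate(words)}


def embed(spec, vocab):
    """Step 1 (stand-in for Doc2Vec): turn a specification into a term-count vector."""
    vec = [0.0] * len(vocab)
    for w in spec.lower().split():
        if w in vocab:  # words unseen during training are ignored
            vec[vocab[w]] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def classify(vec, train_vecs, train_labels, k=3):
    """Step 2 (stand-in for IFROWANN): k-NN vote weighted by inverse class frequency.

    Dividing each neighbour's similarity by the size of its class keeps the
    majority class from winning only because it has more examples.
    """
    class_sizes = Counter(train_labels)
    scored = [(cosine(vec, tv), lbl) for tv, lbl in zip(train_vecs, train_labels)]
    nearest = sorted(scored, key=lambda p: p[0], reverse=True)[:k]
    votes = Counter()
    for sim, lbl in nearest:
        votes[lbl] += sim / class_sizes[lbl]
    return votes.most_common(1)[0][0]


# Hypothetical, deliberately imbalanced toy data: five independent, one dependent.
SPECS = [
    "start train and check brake pressure",
    "verify door opens at platform",
    "check battery voltage level",
    "measure cabin temperature sensor",
    "requires brake test passed then release brake",
    "verify display brightness",
]
LABELS = ["independent", "independent", "independent",
          "independent", "dependent", "independent"]

VOCAB = build_vocab(SPECS)
TRAIN = [embed(s, VOCAB) for s in SPECS]


def predict(spec):
    """Classify a new test specification against the toy training set."""
    return classify(embed(spec, VOCAB), TRAIN, LABELS)
```

Calling `predict("release brake after brake test passed")` on this toy data selects the single dependent example as the closest match. A real study would need learned embeddings and a proper evaluation protocol, neither of which this sketch attempts.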