
Exploring True Test Overfitting in Dynamic Automated Program Repair using Formal Methods

Authors :
David R. Cok
Xuan-Bach D. Le
Corina S. Păsăreanu
Amirfarhad Nilizadeh
Gary T. Leavens
Source :
ICST
Publication Year :
2021
Publisher :
Zenodo, 2021.

Abstract

Automated program repair (APR) techniques have shown a promising ability to generate patches that fix program bugs automatically. Typically such APR tools are dynamic in the sense that they find bugs by testing and they validate patches by running a program's test suite. Patches can also be validated manually. However, neither of these methods for validating patches can truly tell whether a patch is correct. Test suites are usually incomplete, and thus APR-generated patches may pass the tests but not be truly correct; in other words, the APR tools may be overfitting to the tests. The possibility of test overfitting leads to manual validation, which is costly, potentially biased, and can also be incomplete. Therefore, we must move past these methods to truly assess APR's overfitting problem. We aim to evaluate the test overfitting problem in dynamic APR tools using ground truth given by a set of programs equipped with formal behavioral specifications. Using these formal specifications and an automated verification tool, we found that there is definitely overfitting in the generated patches of seven well-studied APR tools, although many (about 59%) of the generated patches were indeed correct. Our study further points out two new problems that can affect APR tools: changes to the complexity of programs and numeric problems. An additional contribution is that we introduce the first publicly available data set of formally specified and verified Java programs, their test suites, and buggy variants, each of which has exactly one bug.
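To make the test-overfitting phenomenon the abstract describes concrete, here is a minimal, hypothetical Java sketch (not drawn from the paper's data set; all names are illustrative). A method carries a JML-style postcondition; an APR-generated "patch" passes an incomplete test suite yet violates the specification on an untested input, which is exactly what a formal verifier such as OpenJML could detect while testing could not.

```java
// Illustrative sketch of test overfitting: the "patched" method passes
// an incomplete test suite but violates the formal specification.
public class OverfitDemo {
    //@ ensures \result >= 0 && (\result == x || \result == -x);
    static int absCorrect(int x) {
        return x < 0 ? -x : x;
    }

    // Hypothetical APR-generated patch: satisfies every test below,
    // but returns a negative result for negative inputs, violating
    // the postcondition above.
    static int absOverfit(int x) {
        return x;
    }

    public static void main(String[] args) {
        // Incomplete test suite: only non-negative inputs are exercised.
        int[] tests = {0, 3, 7};
        boolean suitePasses = true;
        for (int t : tests) {
            suitePasses &= (absOverfit(t) == absCorrect(t));
        }
        System.out.println("test suite passes: " + suitePasses);

        // A verifier checks the postcondition for ALL inputs; here we
        // emulate that check on one input the test suite missed.
        int r = absOverfit(-5);
        boolean specHolds = (r >= 0) && (r == 5 || r == -5);
        System.out.println("spec holds on -5: " + specHolds);
    }
}
```

Dynamic validation reports the patch as plausible (the suite passes), while the specification-based check exposes it as overfitted.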

Details

Database :
OpenAIRE
Journal :
ICST
Accession number :
edsair.doi.dedup.....fbea6e3df315af725a01876ef82505ac
Full Text :
https://doi.org/10.5281/zenodo.4670465