1. N-Version Assessment and Enhancement of Generative AI
- Author
- Kessel, Marcus and Atkinson, Colin
- Subjects
- Computer Science - Software Engineering, Computer Science - Artificial Intelligence, D.2.1, D.2.4, I.2.2, I.2.7
- Abstract
- Generative AI (GAI) holds great potential to improve software engineering productivity, but its untrustworthy outputs, particularly in code synthesis, pose significant challenges. The need for extensive verification and validation (V&V) of GAI-generated artifacts may undermine the potential productivity gains. This paper proposes a way of mitigating these risks by exploiting GAI's ability to generate multiple versions of code and tests to facilitate comparative analysis across versions. Rather than relying on the quality of a single test or code module, this "differential GAI" (D-GAI) approach promotes more reliable quality evaluation through version diversity. We introduce the Large-Scale Software Observatorium (LASSO), a platform that supports D-GAI by executing and analyzing large sets of code versions and tests. We discuss how LASSO enables rigorous evaluation of GAI-generated artifacts and propose its application in both software development and GAI research.
- Comment
- This work has been accepted for publication in an upcoming issue of IEEE Software. This work has been submitted to the IEEE for possible publication.
- Published
- 2024
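The comparative analysis across versions that the abstract describes can be illustrated with a minimal sketch. It is not LASSO's API and not the authors' implementation: the function `differential_vote`, the candidate names `v1` to `v3`, and the majority-vote oracle are all assumptions made for illustration. Each candidate version is run on the same inputs, the most common output is treated as the presumed-correct answer, and each version is scored by how often it agrees with that majority.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Tuple


def differential_vote(
    versions: Dict[str, Callable[..., Any]],
    inputs: List[Tuple[Any, ...]],
) -> Dict[str, float]:
    """Score each version by its agreement with the majority output.

    Hypothetical helper: the majority output per input serves as a
    stand-in oracle, so no hand-written expected values are needed.
    """
    agreement = {name: 0 for name in versions}
    for args in inputs:
        outputs = {}
        for name, fn in versions.items():
            try:
                # repr() lets unhashable results (lists, dicts) be counted.
                outputs[name] = repr(fn(*args))
            except Exception as exc:
                # A crash is recorded as just another observable behavior.
                outputs[name] = f"<raised {type(exc).__name__}>"
        majority, _ = Counter(outputs.values()).most_common(1)[0]
        for name, out in outputs.items():
            if out == majority:
                agreement[name] += 1
    return {name: hits / len(inputs) for name, hits in agreement.items()}


# Three hypothetical "generated" versions of an absolute-value function.
candidates = {
    "v1": lambda x: x if x >= 0 else -x,
    "v2": abs,
    "v3": lambda x: x,  # deliberately faulty: ignores the sign
}
scores = differential_vote(candidates, [(3,), (-2,), (0,), (-7,)])
print(scores)
```

In this sketch the faulty `v3` disagrees with the majority on the two negative inputs and so receives a lower score than `v1` and `v2`. Majority voting assumes that most versions are correct on each input; when faults are correlated across versions, or when outputs tie, the vote can mislead, which is one reason diverse tests and versions matter.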