Back to Search Start Over

Synthetic Data Generation for Statistical Testing

Authors :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Software Verification and Validation Lab (SVV Lab) [research center]
Soltana, Ghanem
Sabetzadeh, Mehrdad
Briand, Lionel
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Software Verification and Validation Lab (SVV Lab) [research center]
Soltana, Ghanem
Sabetzadeh, Mehrdad
Briand, Lionel
Publication Year :
2017

Abstract

Usage-based statistical testing employs knowledge about the actual or anticipated usage profile of the system under test for estimating system reliability. For many systems, usage-based statistical testing involves generating synthetic test data. Such data must possess the same statistical characteristics as the actual data that the system will process during operation. Synthetic test data must further satisfy any logical validity constraints that the actual data is subject to. Targeting data-intensive systems, we propose an approach for generating synthetic test data that is both statistically representative and logically valid. The approach works by first generating a data sample that meets the desired statistical characteristics, without taking into account the logical constraints. Subsequently, the approach tweaks the generated sample to fix any logical constraint violations. The tweaking process is iterative and continuously guided toward achieving the desired statistical characteristics. We report on a realistic evaluation of the approach, where we generate a synthetic population of citizens' records for testing a public administration IT system. Results suggest that our approach is scalable and capable o

Details

Database :
OAIster
Notes :
English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1139861300
Document Type :
Electronic Resource