Towards long term data quality in a large scale biometrics experiment

Authors :
Douglas Thain
Hoang Bui
Patrick J. Flynn
Clarence Helm
Diane Wright
Rachel Witty
Source :
HPDC
Publication Year :
2010
Publisher :
ACM, 2010.

Abstract

Quality of data plays a very important role in any scientific research. In this paper, we present some of the challenges that we face in managing and maintaining data quality for a terabyte-scale biometrics repository. We have developed a step-by-step model to capture, ingest, validate, and prepare data for biometrics research. During these processes, many hidden errors can be introduced into the data. Those errors can affect the overall quality of the data and thus skew the results of biometrics research. We discuss the steps we have taken to reduce and eliminate these errors. Steps such as data replication, automated data validation, and logging of metadata changes are crucial to improving the quality and reliability of our data.

Details

Database :
OpenAIRE
Journal :
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Accession number :
edsair.doi...........422cfbc85975a436852a20f1cc723bdb