Using Data Visualization Technique to Detect Sensitive Information Re-Identification Problem of Real Open Dataset

Authors :
Chiun-How Kao
Yu-Ting Kuang
Chih-Hung Hsieh
Yu-Feng Chu
Source :
2016 International Computer Symposium (ICS).
Publication Year :
2016
Publisher :
IEEE, 2016.

Abstract

Releasing valuable information as public datasets offers great potential to academia and industry. Although most data owners apply a de-identification process before release, the more datasets are opened to the public, the more likely it is that personal privacy will be exposed. Previous studies have shown that personal identity and sensitive information can be re-identified by joining two or more de-identified data tables on common attributes. According to previous real case studies, even when personally identifiable information has been de-identified, sensitive personal information can still be uncovered by heterogeneous or cross-domain data joining operations. This kind of re-identification risk is usually too complicated or obscure for data owners to recognize, and the problem becomes more severe as the scale of the data grows. To prevent damage from sensitive information leakage, this paper shows how to use a novel open data de-identification visualization analysis tool (ODD Visualizer) to verify whether a sensitive information leakage problem exists in the target datasets. The effectiveness of ODD Visualizer comes mainly from its implementation on a scalable computing platform and its efficient data visualization technique. A demonstration shows that ODD Visualizer uncovered one real record linkage vulnerability among open datasets available on the Internet.
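
The record linkage risk described in the abstract, where de-identified tables are joined on common attributes, can be illustrated with a minimal sketch. This is not the ODD Visualizer method, which the abstract does not detail. The table contents, the column names (`zip`, `birth_year`, `sex`), and the helper `link_records` are all hypothetical and invented for illustration only.

```python
from collections import defaultdict

# Hypothetical toy tables; every value here is invented for illustration.
# Table A: a "de-identified" release (names removed, sensitive field kept).
health = [
    {"zip": "10001", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10001", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "10002", "birth_year": 1980, "sex": "F", "diagnosis": "flu"},
    {"zip": "10002", "birth_year": 1980, "sex": "F", "diagnosis": "migraine"},
]
# Table B: a separate public table that still carries names.
public = [
    {"name": "Alice", "zip": "10001", "birth_year": 1980, "sex": "F"},
    {"name": "Bob", "zip": "10001", "birth_year": 1975, "sex": "M"},
    {"name": "Carol", "zip": "10002", "birth_year": 1980, "sex": "F"},
]

# Attributes shared by both tables (assumed for this example).
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(left, right, keys):
    """Join two tables on shared attributes, keeping only pairs whose
    key combination is unique in both tables (unambiguous links)."""

    def index(rows):
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[k] for k in keys)].append(row)
        return groups

    left_idx, right_idx = index(left), index(right)
    linked = []
    for key, left_rows in left_idx.items():
        right_rows = right_idx.get(key, [])
        if len(left_rows) == 1 and len(right_rows) == 1:
            # One record on each side: the name and the sensitive
            # field can be tied to the same person.
            linked.append({**right_rows[0], **left_rows[0]})
    return linked


reidentified = link_records(health, public, QUASI_IDENTIFIERS)
for row in reidentified:
    print(row["name"], "->", row["diagnosis"])
```

In this toy data, two of the three named people are linked to a diagnosis even though the first table contains no names. The third is not, because two records share her attribute combination, which is the kind of ambiguity that de-identification aims to preserve.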

Details

Database :
OpenAIRE
Journal :
2016 International Computer Symposium (ICS)
Accession number :
edsair.doi...........b3408d9dce71031507c35aa714d8e287
Full Text :
https://doi.org/10.1109/ics.2016.0073