Using Data Visualization Technique to Detect Sensitive Information Re-Identification Problem of Real Open Dataset
- Source :
- 2016 International Computer Symposium (ICS).
- Publication Year :
- 2016
- Publisher :
- IEEE, 2016.
Abstract
- Releasing datasets that contain valuable information as open data offers great potential value to academia and industry. Most data owners de-identify their data before release; nevertheless, the more datasets are opened to the public, the more likely it is that personal privacy will be exposed. Previous studies have shown that personal identity and sensitive information can be re-identified by joining two or more de-identified data tables on common attributes. Real case studies show that, even after personally identifiable information has been de-identified, sensitive personal information can still be uncovered by joining heterogeneous or cross-domain data. Such re-identification risks are usually too complicated or obscure for data owners to notice, and the problem becomes more severe as the scale of the data grows. To prevent the damage caused by leakage of sensitive information, this paper shows how to use a novel open data de-identification visualization analysis tool (ODD Visualizer) to verify whether a sensitive information leakage problem exists in the target datasets. The effectiveness of ODD Visualizer comes mainly from its scalable computing platform and its efficient data visualization technique. A demonstration shows that ODD Visualizer uncovered one real record linkage vulnerability among open datasets available on the internet.
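The record linkage risk described in the abstract can be illustrated with a minimal sketch. This is not the ODD Visualizer implementation, which the record does not describe; the two toy tables, the attribute names (`zip`, `birth_year`, `sex`), and the helper `link_records` are all hypothetical and chosen only for illustration. The sketch joins a de-identified table with a second public table on their shared attributes and reports the rows that match exactly one record on each side.

```python
# Hypothetical toy data: the "medical" table has no names, and the
# "voters" table has no diagnoses. Neither is sensitive on its own.
medical = [
    {"zip": "10617", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10617", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "30010", "birth_year": 1980, "sex": "F", "diagnosis": "flu"},
]
voters = [
    {"name": "Alice", "zip": "10617", "birth_year": 1980, "sex": "F"},
    {"name": "Bob", "zip": "10617", "birth_year": 1975, "sex": "M"},
    {"name": "Carol", "zip": "40020", "birth_year": 1990, "sex": "F"},
]

# Assumed common attributes shared by both tables.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(left, right, keys):
    """Join two tables on shared attributes, keeping only unambiguous matches."""
    def key_of(row):
        return tuple(row[k] for k in keys)

    # Index the right table by its key values.
    index = {}
    for row in right:
        index.setdefault(key_of(row), []).append(row)

    # Count how often each key occurs in the left table.
    left_counts = {}
    for row in left:
        left_counts[key_of(row)] = left_counts.get(key_of(row), 0) + 1

    linked = []
    for row in left:
        k = key_of(row)
        matches = index.get(k, [])
        # A key that is unique in both tables links one person to one record.
        if len(matches) == 1 and left_counts[k] == 1:
            linked.append({**matches[0], **row})
    return linked


reidentified = link_records(medical, voters, QUASI_IDENTIFIERS)
for record in reidentified:
    print(record["name"], record["diagnosis"])
```

In this toy case two of the three de-identified rows are linked back to a name, even though neither table contains both a name and a diagnosis. A real check would need to handle generalized or noisy attribute values, which this sketch ignores.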
- Subjects :
- Information privacy
Computer science
Data science
Visualization
Open data
Information sensitivity
Data visualization
The Internet
Data mining
Personally identifiable information
Vulnerability (computing)
Details
- Database :
- OpenAIRE
- Journal :
- 2016 International Computer Symposium (ICS)
- Accession number :
- edsair.doi...........b3408d9dce71031507c35aa714d8e287
- Full Text :
- https://doi.org/10.1109/ics.2016.0073