Deep learning-aided automated personal data discovery and profiling.

Authors :
YAYIK, Apdullah
AYBAR, Vedat
APİK, Hasan
İÇÖZ, Sevcan
BAKAR, Bekir
GÜNGÖR, Tunga
Source :
Turkish Journal of Electrical Engineering & Computer Sciences; 2022, Vol. 30 Issue 1, p167-183, 17p
Publication Year :
2022

Abstract

In Turkey, Personal Data Protection Rule (PDPR) No. 6698, in force since 2016, gives citizens legal protection for their personal data. Although the law provides excellent guidance, companies currently face challenges in complying with its regulations on storing, sharing, and monitoring personal data. Since no specially designed software with wide industrial usage is on the market, almost all companies have no choice but to perform expensive and error-prone operations manually to ensure compliance. In this paper, we present an automated solution to facilitate and accelerate PDPR compliance. In a structured or unstructured document, words or phrases that could include personal data entities are tagged with the help of a Bi-LSTM-based named entity recognition model and a rule-based matching component that employs contextual analysis. To find associations among personal data and obtain individual personal profiles, these entities are divided into categories according to their confidence levels. Personal profiles are constructed using an approach similar to clustering: it treats the personal data categories with high identification levels as separate clusters and finds related personal data entities to the left and/or right of their contexts. We evaluated the system on a data set of 70 documents of different types containing different personal data entities. We obtained a micro-averaged F1-measure of 91.76% for personal data entity classification and an accuracy of 72.46% for profile extraction. We also performed experiments on the performance of the named entity recognition component and on the time complexity of the overall system using a data set of 33K documents. [ABSTRACT FROM AUTHOR]
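
The clustering-like profile construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the category names (FULL_NAME, NATIONAL_ID, EMAIL, and so on), the token-distance window, and the nearest-anchor rule are all assumptions made here for illustration. The idea shown is only that entities from high-identification categories seed one cluster each, and the remaining entities are attached to the closest seed found to their left or right.

```python
from dataclasses import dataclass, field

# Hypothetical high-identification categories; the abstract does not list the real ones.
ANCHOR_CATEGORIES = {"NATIONAL_ID", "FULL_NAME"}


@dataclass
class Entity:
    text: str
    category: str
    position: int  # token offset of the entity in the document (assumed representation)


@dataclass
class Profile:
    anchor: Entity
    members: list = field(default_factory=list)


def build_profiles(entities, window=10):
    """Seed one profile per anchor entity, then attach every other entity
    to the nearest anchor lying within `window` tokens on either side."""
    ordered = sorted(entities, key=lambda e: e.position)
    profiles = [Profile(e) for e in ordered if e.category in ANCHOR_CATEGORIES]
    for ent in ordered:
        if ent.category in ANCHOR_CATEGORIES:
            continue
        # Search left and right at once by taking the smallest absolute distance.
        nearest = min(
            profiles,
            key=lambda p: abs(p.anchor.position - ent.position),
            default=None,
        )
        if nearest is not None and abs(nearest.anchor.position - ent.position) <= window:
            nearest.members.append(ent)
    return profiles
```

Under these assumptions, an e-mail address a few tokens after a name joins that name's profile, while an entity far from every anchor stays unassigned. A real system would presumably also weigh the confidence levels the abstract mentions, which this sketch omits.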

Details

Language :
English
ISSN :
13000632
Volume :
30
Issue :
1
Database :
Complementary Index
Journal :
Turkish Journal of Electrical Engineering & Computer Sciences
Publication Type :
Academic Journal
Accession number :
154835422
Full Text :
https://doi.org/10.3906/elk-2102-54