38 results on '"Data anonymity"'
Search Results
2. Apache Flink and clustering-based framework for fast anonymization of IoT stream data
- Author
-
Alireza Sadeghi-Nasab, Hossein Ghaffarian, and Mohsen Rahmani
- Subjects
Internet of Things, Data privacy, Streaming data, Data anonymity, Apache Flink, Data processing engine, Cybernetics, Q300-390, Electronic computers. Computer science, QA75.5-76.95
- Abstract
In this paper, we present a novel framework that takes the expiration time of Internet of Things (IoT) stream data into account when anonymizing it. IoT is one of the fastest-growing technologies in the world, and anonymity is one of the safeguards in place to protect data privacy. Because of the dynamic nature, vastness, and rapid changes of data streams, traditional approaches cannot be used to anonymize IoT data. The anonymization framework proposed in this paper operates using a new clustering method and the Apache Flink stream-processing engine. In this framework, we first cluster the received data. Then, if the size of a cluster does not meet the k-anonymity threshold, the framework suppresses and deletes it; otherwise, the data are anonymized and published. In this way, the framework handles both numerical and categorical data. At the end of the stream, the final remaining data are merged and anonymized. Implementing and evaluating the framework using Scala and Apache Flink shows that the proposed approach reduces data delay by 12.33–66.62% compared with the other methods. Furthermore, combining the leftover clusters at the end avoids information loss: in comparison with similar methods, information loss is reduced by 5.68–18.26%. The evaluation results show better performance in terms of both data delay and information loss.
- Published
- 2023
- Full Text
- View/download PDF
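The cluster-then-threshold step described in the abstract of result 2 can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the `cluster_key` function stands in for the paper's clustering method (which the abstract does not specify), the `generalize` helper and the record fields `age` and `reading` are hypothetical, and the end-of-stream merging of leftover clusters is left out. The paper's own implementation uses Scala and Apache Flink; plain Python is used here only for brevity.

```python
from collections import defaultdict


def generalize(members):
    """Replace each age with the cluster's (min, max) range.

    Hypothetical generalization step; the abstract does not say how
    values are generalized.
    """
    low = min(m["age"] for m in members)
    high = max(m["age"] for m in members)
    return [{"age": (low, high), "reading": m["reading"]} for m in members]


def publish_or_suppress(records, cluster_key, k):
    """Cluster records, then publish clusters of size >= k and suppress the rest."""
    clusters = defaultdict(list)
    for record in records:
        # cluster_key is an assumed stand-in for the real clustering method.
        clusters[cluster_key(record)].append(record)

    published, suppressed = [], []
    for members in clusters.values():
        if len(members) >= k:
            published.extend(generalize(members))
        else:
            suppressed.extend(members)
    return published, suppressed


records = [
    {"age": 21, "reading": 3.1},
    {"age": 24, "reading": 2.9},
    {"age": 27, "reading": 3.4},
    {"age": 52, "reading": 1.0},
]
# Cluster by decade of age and require at least 3 records per cluster.
published, suppressed = publish_or_suppress(records, lambda r: r["age"] // 10, k=3)
```

With these sample records, the three readings from people in their twenties form one cluster that meets the threshold, while the single record with age 52 is held back.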
3. Data and Location Privacy of Smart Devices over Vehicular Cloud Computing
- Author
-
Hani Al-Balasmeh, Maninder Singh, and Raman Singh
- Subjects
location privacy, cryptography, data anonymity, vehicular cloud computing, Telecommunication, TK5101-6720
- Abstract
In this paper, we address the problem of data and location privacy for smart devices over vehicular cloud computing (VCC). We propose a framework to identify and register smart GPS devices with the VCC service and to allow users to monitor a smart GPS device in real time. The proposed framework has three parts. First, data anonymization of users' information over VCC, which masks the user's original data and replaces it with fake data; this technique removes the user's identity and other links that could identify the user. Second, a technique based on asymmetric cryptography (RSA) that provides location privacy for users' trajectories before a point of interest (POI) is requested from location-based services (LBS). Third, secure communication between users and the VCC based on token-based authentication, which authenticates trusted users when they request a location from the VCC service. The proposed framework shows efficiency and reliability in responding to user trajectories from different sources of GPS devices and datasets.
- Published
- 2022
- Full Text
- View/download PDF
4. Deceptive Infusion of Data: A Novel Data Masking Paradigm for High-Valued Systems.
- Author
-
Sundaram, Arvind, Abdel-Khalik, Hany, and Al Rashdan, Ahmad
- Subjects
*MANUFACTURING processes, *INDUSTRIAL engineering, *PRODUCTION engineering, *NUCLEAR reactors, *ARTIFICIAL intelligence, *MACHINE learning
- Abstract
This work addresses how analysts of a high-valued system (e.g., nuclear reactor, aircraft turbine designs) can extract findable, accessible, interoperable, and reusable scientific data for public dissemination to artificial intelligence and machine-learning (AI/ML) researchers in a manner that cannot be reverse-engineered, potentially compromising sensitive or proprietary information. State-of-the-art methods address this problem through data masking techniques, which allow access to a subset of the information while obfuscating private and potentially identifying information (e.g., personally identifying medical data). These methods are unsuitable for industrial engineering processes, where AI/ML tools need explicit access to all the data available to draw the best inference about the system to help optimize its performance and identify its vulnerabilities, etc. Our novel deceptive infusion of data paradigm provides a solution to this conundrum by developing a mathematical approach capable of concealing the identity of the system while providing full access to all the features employed by AI/ML tools to ensure their optimal performance. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
5. A dynamic anonymization privacy-preserving model based on hierarchical sequential three-way decisions.
- Author
-
Qian, Jin, Zheng, Mingchen, Yu, Ying, Zhou, Chuanpeng, and Miao, Duoqian
- Subjects
*DATA privacy, *DATA security, *GENERALIZATION, *PRIVACY, *ANONYMITY, *ALGORITHMS
- Abstract
Data anonymization is one of the common techniques for ensuring data security and privacy. However, existing anonymization techniques often suffer from low execution efficiency and unnecessary information loss when dealing with complex data. Therefore, we propose a dynamic anonymity privacy-preserving model based on hierarchical sequential three-way decisions. Specifically, we first divide the data into multiple granularity spaces by attributes and dynamically process the data in the granularity spaces. Then, in a single granularity space, we construct a generalization hierarchy for the data based on the attribute generalization trees and divide it into positive, negative and boundary regions based on the anonymity parameter. Next, we acquire the positive and boundary regions by generalization and dynamically update the processed data at the next granularity. After that, we suppress the data in the final negative and boundary regions while releasing the positive region. To further improve data availability, we incorporate the idea of differential privacy by adding noise data to the final boundary region, enabling its release, and propose an enhanced anonymity model. Finally, we compare our proposed algorithms with other methods on six datasets. Experimental results show that our method effectively reduces processing costs, improves data usability and protects data privacy. • We construct a data generalization hierarchy using attribute generalization trees and define a multi-hierarchical decision table for k-anonymity. • We process data dynamically by dividing it into granularity spaces based on attribute sets and anonymize using different strategies. • We incorporate hierarchical sequential three-way decisions to propose a novel k-anonymity model. • We also introduce differential privacy to propose an enhanced model that further improves data availability. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
6. Efficient Q-Value Zero-Leakage Protection Scheme in SRS Regularly Publishing Private Data
- Author
-
Zongmin Cui, Lungui Zhang, Bin Wu, Zhiqiang Zhao, Zhuolin Mei, and Zongda Wu
- Subjects
data anonymity, privacy protection, Q-value zero-leakage, regularly publishing private data, spontaneous reporting system, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
Spontaneous Reporting Systems (SRSs), such as the FDA Adverse Event Reporting System (FAERS), have been widely established to collect adverse drug events, and thus promote the detection and analysis of adverse drug reactions (ADRs). SRS data needs to be provided to researchers and is made publicly available to facilitate the study of ADR detection and analysis. In general, SRS data contains private information about individual characteristics. Before the data is published, this private information must be anonymized to prevent the disclosure of individual privacy. There are many privacy protection methods. The most classic method for protecting SRS data is called PPMS. However, in the real world, SRS data grows dynamically and needs to be published regularly. In this case, PPMS has shortcomings in memory consumption, anonymity efficiency, data update and data security. To remove these shortcomings, we propose an Efficient Q-value Zero-leakage protection Scheme in SRS regularly publishing private data, called EQZS. EQZS can deal with almost all potential attacks while removing the shortcomings of PPMS. The experimental results show that EQZS solves the problem of privacy leakage when SRS private data is published regularly, and that it significantly outperforms PPMS in the efficiency of memory consumption, privacy anonymity and data update.
- Published
- 2019
- Full Text
- View/download PDF
7. Supporting Streaming Data Anonymization with Expressions of User Privacy Preferences
- Author
-
Sakpere, Aderonke Busayo, Kayem, Anne V. D. M., Diniz Junqueira Barbosa, Simone, Series editor, Chen, Phoebe, Series editor, Du, Xiaoyong, Series editor, Filipe, Joaquim, Series editor, Kara, Orhun, Series editor, Liu, Ting, Series editor, Kotenko, Igor, Series editor, Sivalingam, Krishna M., Series editor, Washio, Takashi, Series editor, Camp, Olivier, editor, Weippl, Edgar, editor, Bidan, Christophe, editor, and Aïmeur, Esma, editor
- Published
- 2015
- Full Text
- View/download PDF
8. Data Privacy against Composition Attack
- Author
-
Baig, Muzammil M., Li, Jiuyong, Liu, Jixue, Ding, Xiaofeng, Wang, Hua, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Lee, Sang-goo, editor, Peng, Zhiyong, editor, Zhou, Xiaofang, editor, Moon, Yang-Sae, editor, Unland, Rainer, editor, and Yoo, Jaesoo, editor
- Published
- 2012
- Full Text
- View/download PDF
9. Anonymity and Application Privacy in Context of Mobile Computing in eHealth
- Author
-
Slamanig, Daniel, Stingl, Christian, Menard, Christian, Heiligenbrunner, Martina, Thierry, Jürgen, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Löffler, Jobst, editor, and Klann, Markus, editor
- Published
- 2009
- Full Text
- View/download PDF
10. Collecting network data from documents to reach non-participatory populations
- Author
-
Daniel Tischer
- Subjects
050402 sociology, Sociology and Political Science, Computer science, Population, Documents, MGMT theme Global Political Economy, 0504 sociology, 050602 political science & public administration, Data anonymity, education, General Psychology, Offering circular, education.field_of_study, Data collection, 05 social sciences, Network data, General Social Sciences, Citizen journalism, MGMT International Business Management and Strategy, Data science, 0506 political science, networks, Anthropology, career networks, Coding (social sciences)
- Abstract
Collecting social network data is challenging, not least because conventional approaches rely on human participation. However, there are instances where access to research subjects is restricted or non-existent, especially in the high-stakes commercial world. This paper outlines the collection of network data from a relatively obscure financial document – the offering circular. I consider the implications of dealing with a non-participatory population and data that is not produced for social network research. In exploring the process of translating data into social network data I highlight the importance of retaining context and qualitative descriptions in the data. I also consider how different coding strategies impact the network. Finally, I discuss the triangulation, data anonymity and potential ethical-legal implications of collecting data from documents.
- Published
- 2022
11. Data Re-Identification—A Case of Retrieving Masked Data from Electronic Toll Collection
- Author
-
Hsieh-Hong Huang, Jian-Wei Lin, and Chia-Hsuan Lin
- Subjects
data anonymity, open data, de-identification, re-identification, electronic toll collection, Mathematics, QA1-939
- Abstract
With the growth of big data and open data in recent years, the importance of data anonymization is increasing. Original data need to be anonymized before being released to the public, so that personal identities are not revealed. A growing variety of de-identification methods has been proposed to reduce privacy issues; however, much remains to be improved. The purpose of this study is to demonstrate the possibility of re-identification from masked data and to compare the pros and cons of different de-identification methods. A set of electronic toll collection data from Taiwan was used, and we successfully re-identified vehicles with specific patterns. Four de-identification methods were applied; finally, we compared the strengths and weaknesses of these methods and evaluated their appropriateness.
- Published
- 2019
- Full Text
- View/download PDF
12. Privacy Enhancing Technologies: A Review
- Author
-
Argyrakis, John, Gritzalis, Stefanos, Kioulafas, Chris, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, and Traunmüller, Roland, editor
- Published
- 2003
- Full Text
- View/download PDF
13. Data anonymity and truthfulness
- Author
-
Giorgos Poulis
- Subjects
Computer science, business.industry, Internet privacy, Data anonymity, Computer security, computer.software_genre, business, computer
- Abstract
Today, the universal adoption of technology gives users the ability to share many aspects of their lives. Social networks, smartphones with built-in GPS, and wearable biometric sensors are only a few examples of data-recording technologies. Although users' names are removed before the data are published, the published data can still lead to violations of their privacy. For example, a person's identity, home address, financial details, or even medical history can be determined. On the other hand, these data are extremely useful in various studies, such as demographic studies, studies analyzing human behavior, and medical studies. Transforming these data to protect users, in such a way that the data remain useful to researchers, is therefore vital. Several anonymity methods have been proposed to protect data, such as k-anonymity, l-diversity, k^m-anonymity and differential privacy. These methods can be applied either to relational data (e.g., data that include demographic information) or to transaction data (e.g., datasets that include all the items a person purchased or the locations a person visited). Because anonymity in relational data has been studied widely, we focus our research on methods for transaction data, with an emphasis on geospatial data. We also introduce the notion of RT-datasets, which are datasets that contain both relational and transaction data (e.g., patient demographics and diagnoses). For trajectory datasets, we propose several methods to achieve anonymity and to minimize the loss of data utility after anonymization.
First, we present an approach that adopts the k^m-anonymity method for trajectory data, using distance-based generalization. We also develop an effective and efficient anonymization algorithm based on the Apriori principle. We propose two further algorithms that preserve different characteristics of the data: minimizing the distances between points, and satisfying various utility requirements (rules defined by the user). Satisfying these requirements ensures that the data produced are useful and that their analysis will yield meaningful results. Next, we propose a new framework for anonymizing trajectory data that prevents the disclosure of identity as well as of sensitive information about individuals' locations, while preserving the utility of the data. The framework comprises: (a) selecting similar trajectories, using either Z-ordering or projections of points onto the most frequently occurring subtrajectories; (b) organizing the selected trajectories into carefully constructed groups; and (c) anonymizing each group separately. Finally, we present four methods for protection against users who each control a different set of points. Using point deletion or trajectory splitting, our methods transform the trajectories so as to prevent these users from discovering visit information about points they do not control. We then focus on RT-datasets. Preserving privacy and utility in RT-datasets is difficult, because it requires (a) protection against malicious users whose knowledge spans both types of attributes, and (b) preserving the maximum utility of the anonymized dataset.
Existing anonymization techniques cannot be applied to such datasets, and the problem cannot be addressed with popular multi-objective optimization strategies. We therefore propose a first approach to this problem. Based on this approach, we have developed two frameworks that protect privacy by bounding the information loss in one type of attribute while preserving as much information as possible in the other. To implement each framework, we propose algorithms that effectively preserve data utility, as we show through extensive experiments. Finally, we present an integrated system, SECRETA, to allow users with minimal technical knowledge to use and evaluate the available anonymization methods. Specifically, SECRETA lets users evaluate an anonymization method, compare different methods with one another, and combine methods to protect datasets. The analysis of the methods is carried out in an interactive and progressive manner, and the results, including statistics on various characteristics of the data and several indicators of their utility, are summarized and presented graphically.
- Published
- 2021
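The k^m-anonymity notion named in the abstract of result 13 is commonly stated for transaction data as: every combination of at most m items that occurs in the data must occur in at least k records. A minimal check under that reading might look like the sketch below. It is an illustration only, not the thesis's algorithm: the function name `is_km_anonymous` and the sample data are hypothetical, records are treated as unordered sets of items (so the ordering of points in a trajectory is ignored), and no generalization is performed.

```python
from collections import Counter
from itertools import combinations


def is_km_anonymous(transactions, k, m):
    """Return True if every itemset of size <= m that occurs in some
    transaction occurs in at least k transactions."""
    support = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, m + 1):
            # Count each itemset once per transaction that contains it.
            for itemset in combinations(items, size):
                support[itemset] += 1
    return all(count >= k for count in support.values())


# Hypothetical location visits: each set is one person's visited places.
visits = [
    {"a", "b"},
    {"a", "b"},
    {"a", "c"},
    {"a", "c"},
]

ok_2_2 = is_km_anonymous(visits, k=2, m=2)
ok_3_2 = is_km_anonymous(visits, k=3, m=2)
```

Enumerating all itemsets up to size m grows quickly with m and with record length, which is one reason practical algorithms prune the search, for example with the Apriori principle mentioned in the abstract.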
14. Providing data anonymity for a secure database infrastructure.
- Author
-
Popeea, Traian, Constantinescu, Anca, and Rughinis, Razvan
- Abstract
The evolution of computer technology renders servers, workstations, computers and even mobile devices more and more powerful. This leads to their increased use and storage of data. Data warehouses may contain large quantities of sensitive information. Therefore, data privacy is a very important aspect of data publishing. The best method for protecting data privacy is data anonymity. The use of anonymous data improves the degree of privacy provided by data warehouses. However, data anonymity alone is never enough for protecting the privacy of sensitive information, requiring database security also. In this paper we propose a multi-layer approach to database anonymity based on a multi-layer security infrastructure. From an anonymity perspective, we have developed an engine that provides both static and dynamic means of securing sensitive data. Also, the temporal aspect of a dynamic database in time leads to an alteration of possible inferences. Also, in this paper we focused on securing the communication, the operating system and the database server. [ABSTRACT FROM PUBLISHER]
- Published
- 2013
- Full Text
- View/download PDF
15. Inference Detection and Database Security for a Business Environment.
- Author
-
Popeea, Traian, Constantinescu, Anca, Gheorghe, Laura, and Tapus, Nicolae
- Abstract
The number of data collections of person-specific information is increasing exponentially. The risks of compromising privacy are also increasing, but can be limited through data anonymity and data security. Security models, developed for databases, differ in many aspects because they focus on different features of the problem, leading to incomplete implementations of the organizational security strategy. Our paper presents a multi-layer approach to data anonymity and database security, covering the entire process from inference detection to secure communication and database integrity. This approach is useful for protecting sensitive data and assuring in-depth mitigation of different possible attacks. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
16. Secure Data Management Technology
- Author
-
Atsuko Miyaji, Tomoaki Mimoto, and Shinsaku Kiyomoto
- Subjects
Data anonymization, Computer science, business.industry, Data management, Data_MISCELLANEOUS, computer.software_genre, Index (publishing), ComputingMilieux_COMPUTERSANDSOCIETY, Data anonymity, Use case, Data mining, business, computer, Adversary model, Anonymity
- Abstract
In this chapter, we introduce data anonymization techniques for several types of datasets. The data anonymity of an anonymized dataset is an index for estimating the (maximum) reidentification risk of that dataset and is generally defined as a quantitative index based on adversary models. The adversary models are implicitly defined according to the attributes in the datasets, the use cases, and the anonymization techniques. We first review existing anonymization techniques and the adversary models behind their data anonymity definitions; then, we propose a common anonymity definition and its adversary model, which is applicable to several types of anonymization techniques. Furthermore, some extensions of the definition, which are optimized for specific types of datasets, are presented in the chapter.
- Published
- 2020
17. A Probabilistic Perspective on Re-Identifiability.
- Author
-
KOOT, MATTHIJS, MANDJES, MICHEL, VAN 'T NOORDENDE, GUIDO, and DE LAAT, CEES
- Subjects
*AGE groups, *DEMOGRAPHIC characteristics, *DEMOGRAPHY, *POPULATION, *DEMOGRAPHIC research
- Abstract
A quasi-identifier is a set of attributes that can be used to re-identify entries in anonymized data sets. A group of individuals is considered about whom quasi-identifying numerical information is disclosed such as date of birth, age, weight, and height. The fraction of individuals is determined whose information is unique in that group and hence is identifiable unambiguously. Nonuniformity can be captured well by a single number, the Kullback-Leibler distance. For example sets of real microdata, given approximations based on Kullback-Leibler distances are accurate. Second, the effect of disclosing more specific or less specific information is analyzed experimentally. Third, the effect of correlation between numerical attributes is measured. A formula gives the re-identifiability level. The approximations are validated using publicly available demographic data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
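Two quantities mentioned in the abstract of result 17, the fraction of individuals whose quasi-identifying value is unique in a group and the Kullback-Leibler distance as a measure of nonuniformity, can be illustrated with the short sketch below. The function names are hypothetical, and measuring nonuniformity against a uniform distribution over an assumed number of possible values is only one reasonable reading of the abstract; the paper's own approximation formula is not reproduced here.

```python
import math
from collections import Counter


def unique_fraction(values):
    """Fraction of individuals whose value occurs exactly once in the group."""
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)


def kl_from_uniform(values, support_size):
    """KL divergence D(p || u), in nats, between the empirical distribution p
    of the values and the uniform distribution u over support_size outcomes."""
    counts = Counter(values)
    n = len(values)
    # Terms with zero empirical probability contribute nothing to the sum.
    return sum((c / n) * math.log((c / n) * support_size) for c in counts.values())


# Hypothetical birth years for a group of four people.
birth_years = [1970, 1970, 1981, 1990]
fraction = unique_fraction(birth_years)
```

A divergence of zero means the values are spread evenly over the assumed support; larger values indicate a more skewed distribution.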
18. Data obfuscation: anonymity and desensitization of usable data sets.
- Author
-
Bakken, D.E., Parameswaran, R., Blough, D.M., Franz, A.A., and Palmer, T.J.
- Abstract
In some domains, the need for data privacy and data sharing conflict. Data obfuscation addresses this dilemma by extending several existing technologies and defining obfuscation properties that quantify the technologies' usefulness and privacy preservation. [ABSTRACT FROM PUBLISHER]
- Published
- 2004
- Full Text
- View/download PDF
19. Efficient Q-Value Zero-Leakage Protection Scheme in SRS Regularly Publishing Private Data
- Author
-
Bin Wu, Zongda Wu, Zongmin Cui, Zhuolin Mei, Lungui Zhang, and Zhiqiang Zhao
- Subjects
data anonymity, Computer science, Q value, privacy protection, Q-value zero-leakage, Privacy protection, regularly publishing private data, spontaneous reporting system, General Engineering, Topology, lcsh:TA1-2040, Data anonymity, lcsh:Engineering (General). Civil engineering (General), Leakage (electronics)
- Abstract
Spontaneous Reporting Systems (SRSs), such as the FDA Adverse Event Reporting System (FAERS), have been widely established to collect adverse drug events, and thus promote the detection and analysis of adverse drug reactions (ADRs). SRS data needs to be provided to researchers and is made publicly available to facilitate the study of ADR detection and analysis. In general, SRS data contains private information about individual characteristics. Before the data is published, this private information must be anonymized to prevent the disclosure of individual privacy. There are many privacy protection methods. The most classic method for protecting SRS data is called PPMS. However, in the real world, SRS data grows dynamically and needs to be published regularly. In this case, PPMS has shortcomings in memory consumption, anonymity efficiency, data update and data security. To remove these shortcomings, we propose an Efficient Q-value Zero-leakage protection Scheme in SRS regularly publishing private data, called EQZS. EQZS can deal with almost all potential attacks while removing the shortcomings of PPMS. The experimental results show that EQZS solves the problem of privacy leakage when SRS private data is published regularly, and that it significantly outperforms PPMS in the efficiency of memory consumption, privacy anonymity and data update.
- Published
- 2019
20. MR-Anonymization: A Relationship-based Privacy Model
- Author
-
Srikanth Gampa and Abdulrahman Almohaimeed
- Subjects
Information privacy, Order (exchange), Computer science, business.industry, Internet privacy, Data anonymity, business, Cluster analysis, Semantics, Publication, Data modeling, Privacy model
- Abstract
There are several reasons for organizations to publish or share their data. Therefore, ensuring the privacy of individual information is a serious issue. A typical medical organization must publish data about thousands of patients that contain detailed information about each patient. There may be several vulnerable relationships in the data that may lead to identities being exposed. For example, certain diseases are usually associated with groups of a particular age, gender, location, or ethnicity. Data exposure is not limited to specific types of attacks; attackers often try to find vulnerable relationships between data that may lead to exposure of identities. Therefore, the clustering method must be used to find more relationships between large amounts of data. The model provided in this paper aims to improve the concept of data anonymity by proposing an anonymization method that focuses on critical relationships between data. The main idea behind MR-Anonymization is to apply the clustering technique in order to find leakages in such a large dataset.
- Published
- 2019
21. Privacy Preservation Using Various Anonymity Models
- Author
-
Deepak Narula, Shuchita Upadhyaya, and Pardeep Kumar
- Subjects
business.industry, Computer science, k-anonymity, Linkage (mechanical), Computer security, computer.software_genre, law.invention, Information sensitivity, Software, law, Data anonymity, business, computer, Anonymity
- Abstract
The need to collect and share data is increasing day by day, as today’s society requires it. When publishing data, one has to guarantee that sensitive information is kept secret so that no one is able to misuse it. For this purpose, various anonymization methods and techniques can be used. Many recent researchers have focused on proposing different anonymity algorithms and techniques to keep published data secret. This paper reviews various anonymity methods with different anonymity operators, as well as various types of linkage attacks. Finally, it analyzes the performance of various anonymity algorithms on different data sets, based on various parameters, using the ARX data anonymity software.
- Published
- 2018
22. Data Anonymity in the FOO Voting Scheme.
- Author
-
Mauw, S., Verschuren, J., and de Vink, E.P.
- Subjects
PRIVACY, DATA security, ACCESS to information, COMPUTER security, ELECTRONIC voting, ELECTRONIC systems
- Abstract
We study one of the many aspects of privacy, which is referred to as data anonymity, in a formal context. Data anonymity expresses whether some piece of observed data, such as a vote, can be attributed to a user, in this case a voter. We validate the formal treatment of data anonymity by analyzing a well-known electronic voting protocol. [Copyright © Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
23. Research progress of anonymous data release
- Author
-
Zhao Yan, Wei Weimin, Kong Zhiwei, Yang Shuo, and Feng Hua
- Subjects
Information privacy, Process (engineering), Computer science, ComputingMilieux_COMPUTERSANDSOCIETY, Data anonymity, Data publishing, Communications system, Data release, Data science
- Abstract
Anonymous data publishing is one of the methods proposed in recent years to address the privacy disclosure problems that may arise in the data release process. This paper introduces recent research results on anonymous data release, analyzes existing anonymous data release models according to different data backgrounds, classifies and discusses existing anonymous data release methods, summarizes the advantages, disadvantages and applications of each method, and discusses the development trends of data anonymity. Finally, some problems that remain to be solved are put forward.
- Published
- 2017
24. \(\delta\)-privacy: Bounding Privacy Leaks in Privacy Preserving Data Mining
- Author
-
Zhizhou Li and Ten-Hwang Lai
- Subjects
Delta, Computer science, Data_MISCELLANEOUS, 05 social sciences, Adversary, Computer security, computer.software_genre, Privacy preserving, 03 medical and health sciences, 0302 clinical medicine, Bounding overwatch, 0502 economics and business, Evaluation methods, ComputingMilieux_COMPUTERSANDSOCIETY, Data anonymity, Differential privacy, 030212 general & internal medicine, Data mining, computer, 050203 business & management, Intuition
- Abstract
We propose a new definition of privacy, called \(\delta\)-privacy, for privacy-preserving data mining. The intuition behind this work is that, after obtaining a result from a data mining method, an adversary is better able to discover data providers’ private information; if this improvement is large, the method that generated the response is not privacy-considerate. \(\delta\)-privacy requires that no adversary can improve by more than \(\delta\). This definition can be used to assess the risk of privacy leaks in any data mining method; in particular, we show its relation to differential privacy and data anonymity, the two major evaluation methods. We also provide a quantitative analysis of the tradeoff between privacy and utility, and rigorously prove that the information gain of any \(\delta\)-private method does not exceed \(\delta\). Under the framework of \(\delta\)-privacy, it is possible to design a pricing mechanism for a privacy-utility trading system, which is one of our major directions for future work.
- Published
- 2017
25. Embrace data anonymity, not ‘digital consent’
- Author
-
Neil Seeman
- Subjects
Internet ,Risk Management ,Informed Consent ,Multidisciplinary ,Computer science ,business.industry ,Data Anonymization ,Health Care Surveys ,Internet privacy ,Humans ,Data anonymity ,business - Published
- 2019
26. Carbon Heroes Benchmark Program – whole building embodied carbon profiling
- Author
-
Rodrigo Castro and P Pasanen
- Subjects
Architectural engineering ,Embodied carbon ,Profiling (information science) ,Data anonymity ,Stock (geology) - Abstract
Reducing the embodied carbon of the building stock requires a better understanding of the life cycle impacts of the materials used in those buildings. However, the characteristics of the building stock vary significantly by geography and building type. The “Carbon Heroes Benchmark Program” is a cooperative initiative for carbon profiling by building type across different countries. The program’s aim is to create uniform, full life-cycle material benchmarks for common building types. The benchmark program is on track to reach 1000 fully completed and verified buildings by the end of 2019, and contains data breakdowns for over 100 different material types and essential structural parts of a building. All data used in the program are rigorously anonymized, and statistically small data sets are excluded to protect data anonymity. The program implements the EN 15978/ISO 21930 standards as the basis of measurement, and includes life-cycle stages A1-A3, A4, B4-B5, and C1-C4. This presentation will share the preliminary findings of the project, which cover 659 verified buildings (February 2019 cutoff) and include substantial datasets for some of the most common building types in many European countries. The benchmark is generated using One Click LCA.
- Published
- 2019
27. Scalable Distributed Two-Layer Datastore Providing Data Anonymity
- Author
-
Adam Krechowicz
- Subjects
Information privacy ,business.industry ,Computer science ,Distributed computing ,Scalable distributed ,Two layer ,Data anonymity ,Data system ,Cloud computing ,business ,Base (topology) ,Computer network - Abstract
Storing data in public data systems (mostly in the cloud) raises many questions about data privacy. Are our data completely safe? Inspired by these questions, the author started to develop an efficient framework that can be used to improve data privacy when storing data in public data stores. The Scalable Distributed Two-Layer Datastore was used as the base for the framework because it has proved to be a very efficient solution for storing huge data sets.
- Published
- 2016
28. Data Anonymity Decision
- Author
-
Min-Kyoung Jung and Dong-Kweon Hong
- Subjects
Information privacy ,Privacy software ,business.industry ,Computer science ,Data_MISCELLANEOUS ,Internet privacy ,Privacy protection ,Computer security ,computer.software_genre ,Information sensitivity ,Data anonymity ,business ,computer ,Anonymity - Abstract
Research on preserving the privacy of sensitive information has been popular recently, and many studies have examined techniques for generalizing records under k-anonymity rules. Because data anonymization requires a lot of time and resources, it is important to decide whether a table is vulnerable to privacy attacks before it is released, both to improve data utilization and to protect privacy. It is also important to check which attacks the table is vulnerable to and which anonymity methods should be applied to it. This paper describes two possible privacy attacks based on related references. We also suggest a technique for checking whether a data table is vulnerable to either attack and describe which kinds of anonymity methods should be applied to the table. The suggested technique can also be applied to check the safety, against privacy attacks, of anonymized tables in which insert or delete operations have occurred.
- Published
- 2010
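As a rough illustration of the k-anonymity rule mentioned in the abstract above, the following minimal Python sketch checks whether every combination of quasi-identifier values in a table is shared by at least k rows. It is not the technique proposed in the paper, which the abstract does not specify; the function names, the column names, and the row-as-dictionary layout are all assumptions made for this example.

```python
from collections import Counter


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    # Each row is assumed to be a dict; the key tuple identifies its equivalence class.
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every equivalence class contains at least k rows."""
    # An empty table has no equivalence classes, so it passes vacuously.
    return all(size >= k for size in group_sizes(rows, quasi_identifiers).values())


# Hypothetical table: age band and postal prefix act as quasi-identifiers.
table = [
    {"age_band": "30-39", "postal": "K1*", "diagnosis": "flu"},
    {"age_band": "30-39", "postal": "K1*", "diagnosis": "asthma"},
    {"age_band": "40-49", "postal": "K2*", "diagnosis": "flu"},
]
print(is_k_anonymous(table, ["age_band", "postal"], 2))
```

The third row forms an equivalence class of size one, so this hypothetical table fails the check for k = 2 and would need further generalization or suppression before release.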
29. Quality control program: the radiology technician approach
- Author
-
Vitor Manuel Costa Pereira Rodrigues and Helga Alexandra Soares Macedo
- Subjects
Medical education ,business.industry ,Health care quality assurance ,Radiation effects ,Quality control ,Public institution ,Sample (statistics) ,Health promotion ,Radiological weapon ,Medicine ,Data anonymity ,Radiology, Nuclear Medicine and imaging ,Confidentiality ,business ,Radiological protection - Abstract
OBJECTIVE: The present study was aimed at evaluating the importance given by radiology technicians to the implementation of a quality control program and the existence of radiological protection criteria in their centers. MATERIALS AND METHODS: The data for the present descriptive and cross-sectional study were collected by means of a four-part questionnaire, with data anonymity and confidentiality being assured. The sample consisted of 48 radiology technicians working in health institutions in the District of Vila Real (northern Portugal). RESULTS: Among the radiology technicians participating in the present study, 62.5% do not know what a radiology quality control program consists of, although 85.4% consider its implementation very important for their centers and 89.6% consider that its implementation would be a motivating factor. The authors also observed that the institutions evaluated (hospitals and health centers) are not in compliance with the basic principles of radiation protection. CONCLUSION: Although the radiology technicians do not know what a quality control program consists of, they would be willing to collaborate in developing one. The present study revealed a reality that the authors had thought could not exist: public institutions whose mission is based on health promotion ignoring existing non-compliance with radiological protection principles in their services.
- Published
- 2009
30. Data Re-Identification—A Case of Retrieving Masked Data from Electronic Toll Collection.
- Author
-
Huang, Hsieh-Hong, Lin, Jian-Wei, and Lin, Chia-Hsuan
- Subjects
TOLL collection ,IDENTIFICATION ,BIG data ,COMPUTER user identification ,ACQUISITION of data - Abstract
With the growth of big data and open data in recent years, the importance of data anonymization is increasing. Original data need to be anonymized before being released to the public to prevent personal identities from being revealed. A growing variety of de-identification methods has been proposed to reduce privacy issues; however, there is still much to be improved. The purpose of this study is to demonstrate the possibility of re-identification from masked data, and to compare the pros and cons of different de-identification methods. A set of electronic toll collection data from Taiwan was used, and we successfully re-identified vehicles with specific patterns. Four de-identification methods were applied, and we compared their strengths and weaknesses and evaluated their appropriateness.
- Published
- 2019
31. A Framework for the Secure Storage of Data Generated in the IoT
- Author
-
António Pinto and Ricardo Costa
- Subjects
Database ,business.industry ,Computer science ,Order (business) ,Server ,Data anonymity ,The Internet ,Internet of Things ,business ,computer.software_genre ,Computer security ,computer - Abstract
The Internet of Things can be seen as a growing number of things that inter-operate using an Internet-based infrastructure. It has evolved in recent years with little concern for the privacy of its users, especially regarding how the collected data is stored. Technological measures ensuring users’ privacy must be established. In this paper we present a technological framework for the secure storage of data. Things can interact with the framework’s API in much the same way they now interact with their current servers, after which the framework performs the required operations to secure the data before storing it. The methods adopted for secure storage maintain the ability to share data, conveniently allowing authorized access by other users, the initial user’s terms (e.g. data anonymity), and the ability to revoke assigned privileges at all times.
- Published
- 2015
32. Privacy preserving for Big Data Analysis
- Author
-
Russom, Yohannes
- Subjects
K-anonymity ,data anonymity ,Hadoop ,Technology: 500::Information and communication technology: 550 [VDP] ,datateknikk ,informasjonsteknologi ,quasi-identifiers ,MapReduce ,primary identifiers ,privacy - Abstract
Master's thesis in Computer Science. The Safer@Home [6] project at the University of Stavanger aims to create a smart home system that captures sensor data from homes into its data cluster. To provide assistive services through data analytics technologies, sensor data has to be collected centrally so that knowledge discovery algorithms can be run effectively. The information collected from such homes is often very sensitive in nature and needs to be protected while it is processed or shared across the value chain. Data has to be perturbed to protect against disclosure and misuse by adversaries. Anonymization is the process of perturbing data by generalizing and suppressing identifiers that could pose a threat if linked with publicly available databases. Maintaining privacy while still retaining the utility of the data is a great challenge. This thesis evaluates various anonymization methods that suit our requirements. We present the software requirement specification of an anonymization framework and provide a practical implementation of a well-accepted privacy-preserving anonymization algorithm called Mondrian [7]. To quantify the information loss during the anonymization process, a framework is proposed to evaluate the anonymized dataset. Moreover, the thesis proposes a distributed method for solving the anonymization process using the Hadoop MapReduce framework to make a scalable system for big data analysis.
- Published
- 2013
33. Providing Group Anonymity Using Wavelet Transform
- Author
-
Dan Tavrov and Oleg Chertov
- Subjects
Public access ,Wavelet ,Group (mathematics) ,Computer science ,Simple (abstract algebra) ,Digital data ,Data anonymity ,Wavelet transform ,Data mining ,computer.software_genre ,computer ,Anonymity - Abstract
Providing public access to unprotected digital data can pose a threat of unwanted disclosure of restricted information. The problem of protecting such information can be divided into two main subclasses, namely individual and group data anonymity. By group anonymity we mean protecting important data patterns, distributions, and collective features that cannot be determined by analyzing individual records only. An effective and comparatively simple way of solving the group anonymity problem is to apply the wavelet transform. It is easy to implement, powerful enough, and can produce acceptable results if used properly. In this paper, we present a novel method of using the wavelet transform to provide group anonymity; it is achieved by redistributing wavelet approximation values while fixing the data mean value and leaving wavelet details unchanged (or proportionally altering them). Moreover, we provide a comprehensive example to illustrate the method.
- Published
- 2012
34. Data Anonymity in Multi-Party Service Model
- Author
-
Yutaka Miyake, Kazuhide Fukushima, and Shinsaku Kiyomoto
- Subjects
World Wide Web ,Computer science ,Data_MISCELLANEOUS ,Data anonymity ,Table (database) ,k-anonymity ,Service provider ,Service model - Abstract
Existing approaches for protecting privacy in public databases consider a service model in which a service provider publishes public datasets that consist of data gathered from clients. We extend the service model to a setting with multiple service providers. In the new model, a service provider obtains anonymized datasets from other service providers who gather data from clients, and then publishes or uses anonymized datasets generated from the obtained ones. We consider a new service model that involves more than two data holders and a data user, and propose a new privacy requirement. Furthermore, we discuss feasible approaches to searching for a table that satisfies the privacy requirement and present a concrete algorithm to find the table.
- Published
- 2011
35. Privacy Inference Disclosure Control with Access-Unrestricted Data Anonymity
- Author
-
Zude Li
- Subjects
Computer science ,Privacy software ,business.industry ,Data_MISCELLANEOUS ,Control (management) ,Internet privacy ,ComputingMilieux_COMPUTERSANDSOCIETY ,Data anonymity ,Inference ,Computer security ,computer.software_genre ,business ,computer - Abstract
This chapter introduces a formal study of access-unrestricted data anonymity. It covers four aspects: (1) it analyzes the impact of anonymity on data usability; (2) it quantitatively measures privacy disclosure risks in practical environments; (3) it discusses the potential factors leading to privacy disclosure; and (4) it proposes improved anonymity solutions within typical k-anonymity models, which can effectively prevent privacy disclosure related to the published data properties, anonymity principles, and anonymization rules. The experiments found these potential privacy inference violations and showed the enhanced privacy-preserving effect of the new anti-inference policies for access-unrestricted data publication.
- Published
- 2010
36. A cloud-based optimal fuzzy clustering of distributed data
- Author
-
Minyar Sassi Hidri and Rahma Souli-Jbali
- Subjects
Fuzzy clustering ,Computer science ,business.industry ,Distributed computing ,Computation ,Response time ,Cloud computing ,computer.software_genre ,Partition (database) ,Distributed algorithm ,Server ,Data anonymity ,Data mining ,business ,computer - Abstract
Cloud computing is an infrastructure that allows the storage of large datasets. It provides large-scale parallel computing, which permits faster computation on distributed data. The contribution of this paper is the development of a cloud-based fuzzy clustering algorithm for distributed datasets that detects the optimal partition in a global view. The proposed algorithm meets the confidentiality constraint that prohibits the sharing of data between different resources, while guaranteeing the anonymity of the data located on the cloud servers. A series of experiments was conducted to evaluate the efficiency of the proposed algorithm. The results show the performance of the proposed algorithm in terms of both quality and response time.
- Published
- 2015
37. Generalizing data to provide anonymity when disclosing information (abstract)
- Author
-
Pierangela Samarati and Latanya Sweeney
- Subjects
t-closeness ,Information privacy ,business.industry ,Computer science ,Internet privacy ,Data anonymity ,k-anonymity ,business ,Sensor fusion ,Re identification ,Anonymity - Published
- 1998
38. Evaluating the risk of re-identification of patients from hospital prescription records
- Author
-
Tyson Roffey, Régis Vaillancourt, Khaled El Emam, Fida Kamal Dankar, and Mark Lysyk
- Subjects
Pediatrics ,medicine.medical_specialty ,Political science ,medicine ,Data anonymity ,Pharmacology (medical) ,Pharmacy ,Medical prescription ,Humanities ,Article ,Re identification - Abstract
Background: Pharmacies often provide prescription records to private research firms, on the assumption that these records are de-identified (i.e., identifying information has been removed). However, concerns have been expressed about the potential that patients can be re-identified from such records. Recently, a large private research firm requested prescription records from the Children’s Hospital of Eastern Ontario (CHEO), as part of a larger effort to develop a database of hospital prescription records across Canada. Objective: To evaluate the ability to re-identify patients from CHEO’s prescription records and to determine ways to appropriately de-identify the data if the risk was too high. Methods: The risk of re-identification was assessed for 18 months’ worth of prescription data. De-identification algorithms were developed to reduce the risk to an acceptable level while maintaining the quality of the data. Results: The probability of patients being re-identified from the original variables and data set requested by the private research firm was deemed quite high. A new de-identified record layout was developed, which had an acceptable level of re-identification risk. The new approach involved replacing the admission and discharge dates with the quarter and year of admission and the length of stay in days, reporting the patient’s age in weeks, and including only the first character of the patient’s postal code. Additional requirements were included in the data-sharing agreement with the private research firm (e.g., audit requirements and a protocol for notification of a breach of privacy). Conclusions: Without a formal analysis of the risk of re-identification, assurances of data anonymity may not be accurate. A formal risk analysis at one hospital produced a clinically relevant data set that also protects patient privacy and allows the hospital pharmacy to explicitly manage the risks of breach of patient privacy.
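The de-identified record layout described in the abstract of entry 38 (quarter and year of admission, length of stay in days, age in weeks, first character of the postal code) can be sketched in a few lines of Python. This is only an illustration of those four transformations, not the authors' algorithm: the field names, the use of a birth date as input, and the choice to compute age at admission are assumptions made for the example.

```python
from datetime import date


def deidentify(record):
    """Map a raw record to the coarser layout described in the abstract.

    Assumed (hypothetical) input fields: admission_date, discharge_date,
    birth_date (all datetime.date) and postal_code (str).
    """
    admitted = record["admission_date"]
    discharged = record["discharge_date"]
    return {
        # Exact dates are replaced by the year and quarter of admission.
        "admission_year": admitted.year,
        "admission_quarter": (admitted.month - 1) // 3 + 1,
        # Length of stay in days stands in for the discharge date.
        "length_of_stay_days": (discharged - admitted).days,
        # Age in whole weeks, measured at admission (an assumption).
        "age_weeks": (admitted - record["birth_date"]).days // 7,
        # Only the first character of the postal code is kept.
        "postal_prefix": record["postal_code"].strip()[:1].upper(),
    }


# Hypothetical record, invented for illustration only.
raw = {
    "admission_date": date(2008, 5, 14),
    "discharge_date": date(2008, 5, 20),
    "birth_date": date(2008, 1, 2),
    "postal_code": "K1H 8L1",
}
print(deidentify(raw))
```

Coarsening the fields this way trades precision for a lower re-identification risk; whether the resulting risk is acceptable still depends on a formal risk analysis of the kind the abstract describes.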