A Policy-Driven Approach to Secure Extraction of COVID-19 Data From Research Papers
- Source :
- Frontiers in Big Data, Vol 4 (2021), Frontiers in Big Data
- Publication Year :
- 2021
- Publisher :
- Frontiers Media SA, 2021.
Abstract
- The entire scientific and academic community has been mobilized to gain a better understanding of the COVID-19 disease and its impact on humanity. Most research related to COVID-19 needs to analyze large amounts of data in very little time. This urgency has made Big Data Analysis, and related questions around the privacy and security of the data, an extremely important part of research in the COVID-19 era. The White House OSTP has, for example, released a large dataset of papers related to COVID research from which the research community can extract knowledge and information. We show an example system with a machine learning-based knowledge extractor which draws out key medical information from COVID-19 related academic research papers. We represent this knowledge in a Knowledge Graph that uses the Unified Medical Language System (UMLS). However, publicly available studies rely on datasets that might contain sensitive data. Extracting information from academic papers can potentially leak sensitive data, and protecting the security and privacy of this data is equally important. In this paper, we address the key challenges around the privacy and security of such information extraction and analysis systems. Policy regulations like HIPAA have updated their guidelines for securely accessing data, specifically data related to COVID-19. In the US, healthcare providers must also comply with the Office for Civil Rights (OCR) rules to protect data integrity in matters like plasma donation, media access to health care data, telehealth communications, etc. Privacy policies are typically short and unstructured HTML or PDF documents. We have created a framework to extract relevant knowledge from the health centers’ policy documents and also represent these as a knowledge graph.
Our framework helps to understand the extent to which individual provider policies comply with regulations and define access control policies that enforce the regulation rules on data in the knowledge graph extracted from COVID-related papers. Along with being compliant, privacy policies must also be transparent and easily understood by the clients. We analyze the relative readability of healthcare privacy policies and discuss the impact. In this paper, we develop a framework for access control decisions that uses policy compliance information to securely retrieve COVID data. We show how policy compliance information can be used to restrict access to COVID-19 data and information extracted from research papers.
- Subjects :
- Big Data
Computer science
Privacy policy
Access control
Information technology
Telehealth
privacy
computer.software_genre
UMLS
Artificial Intelligence
Data integrity
Computer Science (miscellaneous)
Original Research
HIPAA
business.industry
Unified Medical Language System
COVID-19
T58.5-58.64
Data science
Information extraction
knowledge graph
restrict
business
computer
Information Systems
Details
- ISSN :
- 2624-909X
- Volume :
- 4
- Database :
- OpenAIRE
- Journal :
- Frontiers in Big Data
- Accession number :
- edsair.doi.dedup.....97b585f9f713a0662b5075c14e1a6934