NeuroCrypt: Machine Learning Over Encrypted Distributed Neuroimaging Data
- Source :
- Neuroinformatics
- Publication Year :
- 2021
- Publisher :
- Springer Science and Business Media LLC, 2021.
Abstract
- The field of neuroimaging can benefit greatly from machine learning models that detect and predict diseases and discover novel biomarkers, but much of the data collected at various organizations and research centers cannot be shared because of privacy or regulatory concerns (especially for clinical data or rare disorders). In addition, aggregating data across multiple large studies results in a large amount of duplicated technical debt, and the required resources can be challenging or impossible for an individual site to build. Training on data distributed across organizations can yield models that generalize much better than models trained on data from any one organization alone. While there are approaches for decentralized sharing, these often do not provide the highest possible guarantees of sample privacy, which only cryptography can provide. In addition, such approaches are often focused on probabilistic solutions. In this paper, we propose an approach that leverages the potential of datasets spread among a number of data-collecting organizations by performing joint analyses in a secure and deterministic manner in which only encrypted data is shared and manipulated. The approach is based on secure multiparty computation, which refers to cryptographic protocols that enable distributed computation of a function over distributed inputs without revealing additional information about the inputs. It enables multiple organizations to train machine learning models on their joint data and apply the trained models to encrypted data without revealing their sensitive data to the other parties. In our proposed approach, organizations (or sites) securely collaborate to build a machine learning model as if it had been trained on the aggregated data of all the organizations combined. Importantly, the approach does not require a trusted party (i.e., an aggregator), each contributing site plays an equal role in the process, and no site can learn the individual data of any other site. We demonstrate the effectiveness of the proposed approach in a range of empirical evaluations using different machine learning algorithms, including logistic regression and convolutional neural network models, on human structural and functional magnetic resonance imaging datasets.
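As an illustration of how a function can be computed over distributed inputs without revealing them, the following is a minimal sketch of additive secret sharing, a common building block of secure multiparty computation. It is not the protocol used in the paper, which the abstract does not specify. The modulus `PRIME` and the functions `share`, `reconstruct`, and `secure_sum` are hypothetical names chosen for this example, the inputs are assumed to be non-negative integers smaller than `PRIME`, and all sites are simulated in a single process.

```python
import secrets

# Public modulus (an assumption for this sketch); all arithmetic is mod PRIME.
PRIME = 2**61 - 1


def share(value: int, n_parties: int) -> list[int]:
    """Split `value` into n additive shares that sum to `value` mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recombine additive shares into the value they encode."""
    return sum(shares) % PRIME


def secure_sum(private_inputs: list[int]) -> int:
    """Simulate n sites jointly computing the sum of their private inputs."""
    n = len(private_inputs)
    # Row i holds the shares produced by site i; share j is sent to site j.
    all_shares = [share(x, n) for x in private_inputs]
    # Site j adds the shares it received; it never sees another site's input.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the partial sums are published; together they give the total.
    return reconstruct(partial_sums)


if __name__ == "__main__":
    site_values = [12, 30, 7]  # e.g. one local count per site
    print(secure_sum(site_values))
```

In this sketch no trusted aggregator is needed and every site performs the same steps, which mirrors the equal-role property described in the abstract. A real deployment would also need secure channels between sites and an encoding for real-valued model parameters.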
- Subjects :
- Computer science
Process (engineering)
Neuroimaging
Cryptography
Encryption
Machine learning
computer.software_genre
Convolutional neural network
Article
050105 experimental psychology
Field (computer science)
Machine Learning
03 medical and health sciences
0302 clinical medicine
Humans
0501 psychology and cognitive sciences
Computer Security
business.industry
General Neuroscience
05 social sciences
Probabilistic logic
Cryptographic protocol
Secure multi-party computation
Artificial intelligence
business
computer
Algorithms
030217 neurology & neurosurgery
Software
Information Systems
Details
- ISSN :
- 1559-0089 and 1539-2791
- Volume :
- 20
- Database :
- OpenAIRE
- Journal :
- Neuroinformatics
- Accession number :
- edsair.doi.dedup.....cae3ed30afb3216abf532cfabad23c1f
- Full Text :
- https://doi.org/10.1007/s12021-021-09525-8