Do available protein 3D structures reflect human genetic and functional diversity?
- Publication Year :
- 2019
- Publisher :
- Cold Spring Harbor Laboratory, 2019.
- Abstract :
- Genomic databases are substantially biased towards European ancestry populations, and this bias contributes to health disparities. Here, we quantify how well 66,971 experimentally characterized human protein 3D structures represent the diversity of protein sequences observed across the 1000 Genomes Project. More than 85% of available structures do not match a sequence observed in at least one individual, and on average structures match the sequence of 74% of individuals. Nearly 23% of human structures do not match any observed sequences; however, after masking engineered/known mutations, this decreases to ~4%. African ancestry sequences are modestly, but significantly, less likely to be represented by structures (73.5% vs. 74.0%). These differences are mainly driven by the greater genetic diversity of African populations. We identify thousands of variants unrepresented in available structures that influence protein structure and function. Thus, the use of a single structure as representative of “the wild type” protein will often bias results against many individuals. The diversity of protein sequence and structure must be considered to enable accurate, reproducible, and generalizable conclusions from structural analyses.
- Subjects :
- Protein structure and function
0303 health sciences
Genetic diversity
Biology
Genomic databases
03 medical and health sciences
Functional diversity
0302 clinical medicine
Protein sequencing
Evolutionary biology
1000 Genomes Project
10. No inequality
030217 neurology & neurosurgery
030304 developmental biology
Sequence (medicine)
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....f8ad612305fc971d8a0a4990ec39e470
- Full Text :
- https://doi.org/10.1101/637744