Characterization of intermediate-sized insertions using whole-genome sequencing data and analysis of their functional impact on gene expression.
- Source :
- Human genetics [Hum Genet] 2021 Aug; Vol. 140 (8), pp. 1201-1216. Date of Electronic Publication: 2021 May 12.
- Publication Year :
- 2021
Abstract
- Intermediate-sized insertions are one class of structural variants contributing to genome diversity. However, because they are technically difficult to identify, their importance in disease pathogenicity and gene expression regulation remains unclear. We used whole-genome sequencing data from 174 Japanese samples to characterize intermediate-sized insertions using a highly accurate insertion-calling method (IMSindel software and joint-call recovery) and obtained a catalogue of 4254 insertions. We constructed an imputation panel comprising insertions and SNVs from all samples, and imputed intermediate-sized insertions for 82 publicly available Japanese samples. The positive predictive value of imputation, evaluated using Nanopore long-read sequencing data, was 97%. Subsequent eQTL analysis predicted 128 (~3.0%) insertions as causative for changes in gene expression levels. Enrichment analysis of causal insertions for genome regulatory elements showed significant associations with CTCF-binding sites, super-enhancers, and promoters. Among 17 causal insertions found in the same causal set as GWAS hits, there were insertions associated with changes in expression of cancer-related genes such as BRCA1, ZNF222, and ABCB10. Analysis of insertion sequences revealed that 461 insertions were short tandem duplications frequently found in early-replicating regions of the genome. Furthermore, comparison of the functional importance of intermediate-sized insertions with that of intermediate-sized deletions detected in the same sample set in our previous study showed that insertions were more frequent in genic regions and that the proportion of functional candidates was smaller among insertions. Here, we characterize a high-confidence set of intermediate-sized insertions and indicate their importance in gene expression regulation. Our results emphasize the importance of considering intermediate-sized insertions in trait association studies.
- Subjects :
- ATP-Binding Cassette Transporters genetics
ATP-Binding Cassette Transporters metabolism
BRCA1 Protein genetics
BRCA1 Protein metabolism
Base Sequence
Binding Sites
CCCTC-Binding Factor genetics
CCCTC-Binding Factor metabolism
Chromosome Mapping
Databases, Genetic
Enhancer Elements, Genetic
Gene Expression Profiling
Humans
Japan
Molecular Sequence Annotation
Neoplasm Proteins metabolism
Neoplasms metabolism
Neoplasms pathology
Osteoarthritis metabolism
Osteoarthritis pathology
Promoter Regions, Genetic
Quantitative Trait Loci
Repressor Proteins genetics
Repressor Proteins metabolism
Software
Whole Genome Sequencing
Gene Expression Regulation
Genome, Human
Mutagenesis, Insertional
Neoplasm Proteins genetics
Neoplasms genetics
Osteoarthritis genetics
Details
- Language :
- English
- ISSN :
- 1432-1203
- Volume :
- 140
- Issue :
- 8
- Database :
- MEDLINE
- Journal :
- Human genetics
- Publication Type :
- Academic Journal
- Accession number :
- 33978893
- Full Text :
- https://doi.org/10.1007/s00439-021-02291-2