TypeTE: a tool to genotype mobile element insertions from whole genome resequencing data
- Source :
- Nucleic Acids Research
- Publication Year :
- 2019
- Publisher :
- Cold Spring Harbor Laboratory, 2019.
Abstract
- Alu retrotransposons account for more than 10% of the human genome, and insertions of these elements create structural variants segregating in human populations. Such polymorphic Alu insertions are powerful markers for understanding population structure, and they represent variants that can greatly impact genome function, including gene expression. Accurate genotyping of Alu and other mobile elements has been challenging. Indeed, we found that Alu genotypes previously called for the 1000 Genomes Project are sometimes erroneous, which poses significant problems for phasing these insertions with other variants that comprise the haplotype. To ameliorate this issue, we introduce a new pipeline, TypeTE, which genotypes Alu insertions from whole-genome sequencing data. Starting from a list of polymorphic Alu insertions, TypeTE identifies the hallmarks (poly-A tail and target site duplication) and orientation of each insertion, using local re-assembly to reconstruct presence and absence alleles. Genotype likelihoods are then computed after re-mapping sequencing reads to the reconstructed alleles. Using a ‘gold standard’ set of PCR-based genotypes for >200 loci, we show that TypeTE improves genotype accuracy from 83% to 92% in the 1000 Genomes dataset. TypeTE can be readily adapted to other retrotransposon families and is a valuable addition to the population genomics toolbox.
- Subjects :
- Genotype
AcademicSubjects/SCI00010
0206 medical engineering
Population
Retrotransposon
02 engineering and technology
Computational biology
Biology
Genome
Population genomics
03 medical and health sciences
0302 clinical medicine
Gene Frequency
Databases, Genetic
Genetics
Humans
1000 Genomes Project
education
Genotyping
030304 developmental biology
Narese/7
0303 health sciences
education.field_of_study
Whole Genome Sequencing
Genome, Human
Haplotype
Interspersed Repetitive Sequences
Mutagenesis, Insertional
ComputingMethodologies_PATTERNRECOGNITION
Narese/24
Genetics, Population
Genetic Loci
Narese/28
Methods Online
Human genome
Mobile genetic elements
020602 bioinformatics
030217 neurology & neurosurgery
Software
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Nucleic Acids Research
- Accession number :
- edsair.doi.dedup.....b4f6a82c70ca6fe491dfd96c54b76fb7
- Full Text :
- https://doi.org/10.1101/791665
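
The abstract states that genotype likelihoods are computed after reads are re-mapped to the reconstructed presence and absence alleles. As a rough illustration of that step only, the sketch below uses a generic biallelic binomial model on read counts. It is not taken from TypeTE: the function names, the `error_rate` default, and the model itself are assumptions made for this example.

```python
import math


def genotype_likelihoods(n_absence, n_presence, error_rate=0.01):
    """Log10 likelihoods of genotypes 0/0, 0/1 and 1/1 at one insertion locus.

    n_absence  -- reads supporting the absence (empty-site) allele
    n_presence -- reads supporting the presence (insertion) allele
    error_rate -- assumed chance that a read supports the wrong allele
                  (hypothetical default, not a TypeTE parameter)
    """
    # Probability that a single read supports the presence allele,
    # given each genotype (generic binomial model, assumed here).
    p_presence = {"0/0": error_rate, "0/1": 0.5, "1/1": 1.0 - error_rate}

    log10_lik = {}
    for genotype, p in p_presence.items():
        # The binomial coefficient is identical for all three genotypes,
        # so it is omitted; only relative likelihoods matter.
        log10_lik[genotype] = (
            n_presence * math.log10(p) + n_absence * math.log10(1.0 - p)
        )
    return log10_lik


def call_genotype(log10_lik):
    """Return the most likely genotype and Phred-scaled likelihoods (PL)."""
    best = max(log10_lik, key=log10_lik.get)
    # PL is normalised so that the best genotype has a value of 0.
    pl = {
        genotype: round(-10 * (value - log10_lik[best]))
        for genotype, value in log10_lik.items()
    }
    return best, pl


if __name__ == "__main__":
    # Example: 6 reads map to the absence allele, 5 to the presence allele.
    best, pl = call_genotype(genotype_likelihoods(6, 5))
    print(best, pl)
```

A production genotyper would typically also weight each read by its mapping and base qualities; this sketch treats all reads equally to keep the model visible.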