195 results on '"Interspersed Repeat"'
Search Results
2. Locus-specific chromatin profiling of evolutionarily young transposable elements
- Author
Darren Taylor, Robert Lowe, Claude Philippe, Kevin C. L. Cheng, Olivia A. Grant, Nicolae Radu Zabet, Gael Cristofari, and Miguel R. Branco
Affiliations: Barts & The London School of Medicine and Dentistry [London, UK] (Blizard Institute), Queen Mary University of London (QMUL); Institut de Recherche sur le Cancer et le Vieillissement (IRCAN), Université Nice Sophia Antipolis (UNS), COMUE Université Côte d'Azur (COMUE UCA), Institut National de la Santé et de la Recherche Médicale (INSERM), Centre National de la Recherche Scientifique (CNRS), Université Côte d'Azur (UCA)
Funding: ANR-11-LABX-0028 (SIGNALIFE, Network for Innovation on Signal Transduction Pathways in Life Sciences, 2011); ANR-16-CE12-0020 (RETROMET, Making repeated DNA unique, or how to reveal the epigenetic regulation of L1 retrotransposons in human somatic cells at unprecedented resolution, 2016); ANR-19-CE12-0032 (ImpacTE, Regulatory networks and LINE-1 elements: global impact of recent transposable elements on gene activity in mammals, 2019)
- Subjects
Transposable element ,[SDV]Life Sciences [q-bio] ,Interspersed repeat ,Locus (genetics) ,[SDV.GEN] Life Sciences [q-bio]/Genetics ,Computational biology ,Biology ,Genome ,03 medical and health sciences ,Mice ,0302 clinical medicine ,[SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN] ,Genetics ,Animals ,Humans ,Epigenetics ,Epigenomics ,030304 developmental biology ,Regulation of gene expression ,0303 health sciences ,[SDV.GEN]Life Sciences [q-bio]/Genetics ,[SDV.BIBS] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM] ,Genomics ,Repetitive dna ,[SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM] ,Chromatin ,[SDV] Life Sciences [q-bio] ,Gene Expression Regulation ,Epigenetics and chromatin ,DNA Transposable Elements ,[SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN] ,Transposable elements ,030217 neurology & neurosurgery
- Abstract
Despite a vast expansion in the availability of epigenomic data, our knowledge of the chromatin landscape at interspersed repeats remains highly limited by difficulties in mapping short-read sequencing data to these regions. In particular, little is known about the locus-specific regulation of evolutionarily young transposable elements (TEs), which have been implicated in genome stability, gene regulation and innate immunity in a variety of developmental and disease contexts. Here we propose an approach for generating locus-specific protein–DNA binding profiles at interspersed repeats, which leverages information on the spatial proximity between repetitive and non-repetitive genomic regions. We demonstrate that the combination of HiChIP and a newly developed mapping tool (PAtChER) yields accurate protein enrichment profiles at individual repetitive loci. Using this approach, we reveal previously unappreciated variation in the epigenetic profiles of young TE loci in mouse and human cells. Insights gained using our method will be invaluable for dissecting the molecular determinants of TE regulation and their impact on the genome.
- Published
- 2021
3. Resolving repeat families with long reads
- Author
Philipp Bongartz
- Subjects
Transposable element ,Computer science ,Contiguity ,Interspersed repeat ,Sequence assembly ,Repeat resolution ,Computational biology ,lcsh:Computer applications to medicine. Medical informatics ,Biochemistry ,Genome ,03 medical and health sciences ,0302 clinical medicine ,Repeat families ,Structural Biology ,Databases, Genetic ,Humans ,Molecular Biology ,lcsh:QH301-705.5 ,030304 developmental biology ,0303 health sciences ,Genome assembly ,Contig ,Applied Mathematics ,Methodology Article ,Chromosome ,Sequence Analysis, DNA ,Computer Science Applications ,lcsh:Biology (General) ,030220 oncology & carcinogenesis ,lcsh:R858-859.7 ,DNA microarray ,Algorithms
- Abstract
Background: Draft-quality genomes for a multitude of organisms have become common due to the advancement of genome assemblers using long-read technologies with high error rates. Although current assemblies are substantially more contiguous than assemblies based on short reads, complete chromosomal assemblies are still challenging. Interspersed repeat families with multiple copy versions dominate the contig and scaffold ends of current long-read assemblies for complex genomes. These repeat families generally remain unresolved, as existing algorithmic solutions either do not scale to large copy numbers or cannot handle the current high read error rates. Results: We propose novel repeat resolution methods for large interspersed repeat families and assess their accuracy on simulated data sets with various distinct repeat structures and on Drosophila melanogaster transposons. Additionally, we compare our methods to an existing long-read repeat resolution tool and show the improved accuracy of our method. Conclusions: Our results demonstrate the applicability of our methods for the improvement of the contiguity of genome assemblies. DOI: 10.1186/s12859-019-2807-4.
- Published
- 2019
4. SquiggleNet: real-time, direct classification of nanopore signals
- Author
Yuwei Bao, Torrin L. McDonald, Robert P. Dickson, Joshua D. Welch, David Blaauw, Jack Wadden, Weichen Zhou, Ryan E. Mills, Piyush Ranjan, Alan P. Boyle, and John R. Erb-Downward
- Subjects
DNA, Bacterial ,QH301-705.5 ,Interspersed repeat ,Respiratory System ,Method ,Sequence alignment ,QH426-470 ,Biology ,Genome ,Raw signal ,Deep Learning ,Classifier (linguistics) ,Genetics ,Humans ,Biology (General) ,Read-until ,business.industry ,Deep learning ,Pattern recognition ,Nanopore ,Nanopore Sequencing ,Long Interspersed Nucleotide Elements ,Oxford Nanopore ,Metagenome ,Base calling ,Nanopore sequencing ,Artificial intelligence ,business ,Real-time
- Abstract
We present SquiggleNet, the first deep-learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than DNA passes through the pore, allowing real-time classification and read ejection. Using 1 s of sequencing data, the classifier achieves significantly higher accuracy than base calling followed by sequence alignment. Our approach is also faster and requires an order of magnitude less memory than alignment-based approaches. SquiggleNet distinguished human from bacterial DNA with over 90% accuracy, generalized to unseen bacterial species in a human respiratory metagenome sample, and accurately classified sequences containing human long interspersed repeat elements. DOI: 10.1186/s13059-021-02511-y.
- Published
- 2021
5. CGGBP1-regulated cytosine methylation at CTCF-binding motifs resists stochasticity
- Author
Divyesh Patel, Subhamoy Datta, Manthan Patel, and Umashankar Singh
- Subjects
0301 basic medicine ,lcsh:QH426-470 ,Interspersed repeat ,Cytosine methylation ,Biology ,Genome ,DNA sequencing ,Cytosine ,chemistry.chemical_compound ,03 medical and health sciences ,0302 clinical medicine ,Transduction, Genetic ,Transcription (biology) ,Gene expression ,Genetics ,Humans ,Allelic imbalance ,CGGBP1 ,Alleles ,Genetics (clinical) ,Transcription factor binding sites ,Stochasticity ,Binding Sites ,Chromosome Mapping ,Sequence Analysis, DNA ,Epigenome ,DNA Methylation ,CTCF ,Ctcf binding ,Cell biology ,DNA-Binding Proteins ,DNA binding site ,lcsh:Genetics ,HEK293 Cells ,030104 developmental biology ,chemistry ,DNA methylation ,DNA ,030217 neurology & neurosurgery ,Research Article
- Abstract
The human CGGBP1 is implicated in a variety of cellular functions. It regulates genomic integrity, cell cycle, gene expression and cellular response to growth signals. Evidence suggests that these functions of CGGBP1 manifest through binding to GC-rich regions in the genome and regulation of interspersed repeats. Recent works show that CGGBP1 is needed for cytosine methylation homeostasis and genome-wide occupancy patterns of the epigenome regulator protein CTCF. It has remained unknown if cytosine methylation regulation and CTCF occupancy regulation by CGGBP1 are independent or interdependent processes. By sequencing immunoprecipitated methylated DNA, we have found that some transcription factor-binding sites resist stochastic changes in cytosine methylation. Of these, we have analyzed the CTCF-binding sites thoroughly and show that cytosine methylation regulation at CTCF-binding DNA sequence motifs by CGGBP1 is deterministic. These CTCF-binding sites are positioned at locations where the spread of cytosine methylation in cis depends on the levels of CGGBP1. Our findings suggest that CTCF occupancy and functions are determined by CGGBP1-regulated cytosine methylation patterns.
- Published
- 2020
6. MIR sequences recruit zinc finger protein ZNF768 to expressed genes
- Author
Michael Kluge, Caroline C. Friedel, Axel Imhof, Stefan Krebs, Ann Katrin Greifenberg, Muhammad Ahmad Maqbool, Yousra Yahia, Ignasi Forné, Nicolas Descostes, Jean-Christophe Andrau, Dirk Eick, Michaela Rohrmoser, Matthias Geyer, Anita Gruber-Eber, and Helmut Blum
- Subjects
Euchromatin ,Retroelements ,Transcription, Genetic ,Cell Survival ,Interspersed repeat ,Biology ,ELP3 ,03 medical and health sciences ,0302 clinical medicine ,Cell Line, Tumor ,Gene expression ,Genetics ,Humans ,Nucleotide Motifs ,Gene ,030304 developmental biology ,Repetitive Sequences, Nucleic Acid ,Regulation of gene expression ,Zinc finger ,0303 health sciences ,Binding Sites ,Gene regulation, Chromatin and Epigenetics ,DNA ,Cell biology ,Gene Expression Regulation ,Sequence motif ,030217 neurology & neurosurgery ,Transcription Factors
- Abstract
Mammalian-wide interspersed repeats (MIRs) are retrotransposed elements of mammalian genomes. Here, we report the specific binding of zinc finger protein ZNF768 to the sequence motif GCTGTGTG (N20) CCTCTCTG in the core region of MIRs. ZNF768 binding is preferentially associated with euchromatin and promoter regions of genes. Binding was observed for genes expressed in a cell type-specific manner in human B cell line Raji and osteosarcoma U2OS cells. Mass spectrometric analysis revealed binding of ZNF768 to Elongator components Elp1, Elp2 and Elp3 and other nuclear factors. The N-terminus of ZNF768 contains a heptad repeat array structurally related to the C-terminal domain (CTD) of RNA polymerase II. This array evolved in placental animals but not marsupials and monotreme species, displays species-specific length variations, and possibly fulfills CTD related functions in gene regulation. We propose that the evolution of MIRs and ZNF768 has extended the repertoire of gene regulatory mechanisms in mammals and that ZNF768 binding is associated with cell type-specific gene expression.
- Published
- 2018
7. Genomic Organization of TBK1 Copy Number Variations in Glaucoma Patients
- Author
Young H. Kwon, Robert Ritch, Alan L. Robin, Edwin M. Stone, Wallace L.M. Alward, Todd E. Scheetz, Adam P. DeLuca, John H. Fingert, Kazuhide Kawase, and Jeffrey M. Liebmann
- Subjects
Male ,0301 basic medicine ,DNA Copy Number Variations ,DNA Mutational Analysis ,Interspersed repeat ,Alu element ,Protein Serine-Threonine Kinases ,Gene dosage ,Article ,03 medical and health sciences ,0302 clinical medicine ,Humans ,Medicine ,Low Tension Glaucoma ,Copy-number variation ,Intraocular Pressure ,Chromosome 12 ,Genomic organization ,Genetics ,business.industry ,Chromosome ,DNA ,Interspersed Repetitive Sequences ,Pedigree ,Ophthalmology ,030104 developmental biology ,Mutation ,030221 ophthalmology & optometry ,Female ,business
- Abstract
BACKGROUND: Approximately 1% of normal tension glaucoma (NTG) cases are caused by TANK-binding kinase 1 (TBK1) gene duplications and triplications. However, the precise borders and orientation of these TBK1 gene copy number variations (CNVs) on chromosome 12 are unknown. METHODS: We determined the exact borders of TBK1 CNVs and the orientation of duplicated or triplicated DNA segments in 5 NTG patients with different TBK1 mutations using whole-genome sequencing. RESULTS: Tandemly duplicated chromosome segments spanning the TBK1 gene were detected in 4 NTG patients, each with unique borders. Four of 5 CNVs had borders located within interspersed repetitive DNA sequences (Alu and long interspersed nuclear element-1 [L1] elements), suggesting that mismatched homologous recombinations likely generated these CNVs. A fifth NTG patient had a complex rearrangement including triplication of a chromosome segment spanning the TBK1 gene. CONCLUSIONS: No specific mutation hotspots for TBK1 CNVs were detected; however, interspersed repetitive sequences (i.e., Alu elements) were identified at the borders of TBK1 CNVs, which suggests that mismatch of these elements during meiosis may be the mechanism that generated TBK1 gene dosage mutations.
- Published
- 2017
8. Whole-genome expression analysis of mammalian-wide interspersed repeat elements in human cell lines
- Author
Anastasia Conti, Davide Carnevali, Giorgio Dieci, and Matteo Pellegrini
- Subjects
0301 basic medicine ,RNA polymerase III ,Retroelements ,Transcription, Genetic ,1.1 Normal biological development and functioning ,Interspersed repeat ,Plant Biology & Botany ,Computational biology ,Biology ,ENCODE ,Genome ,03 medical and health sciences ,chemistry.chemical_compound ,Genetic ,Transcription (biology) ,RNA polymerase ,Genetics ,2.1 Biological and endogenous factors ,Humans ,RNA-Seq ,Molecular Biology ,Gene ,Sequence Analysis, RNA ,Gene Expression Profiling ,Human Genome ,Computational Biology ,General Medicine ,mammalian-wide interspersed repeats ,Full Papers ,SINE ,Interspersed Repetitive Sequences ,030104 developmental biology ,chemistry ,Hela Cells ,RNA ,Human genome ,Generic health relevance ,Sequence Analysis ,Transcription ,Overlapping gene ,Biotechnology ,HeLa Cells ,Plasmids
- Abstract
With more than 500,000 copies, mammalian-wide interspersed repeats (MIRs), a sub-group of SINEs, represent ∼2.5% of the human genome and one of the most numerous families of potential targets for the RNA polymerase (Pol) III transcription machinery. Since MIR elements ceased to amplify ∼130 myr ago, previous studies primarily focused on their genomic impact, while the issue of their expression has not been extensively addressed. We applied a dedicated bioinformatic pipeline to ENCODE RNA-Seq datasets of seven human cell lines and, for the first time, we were able to define the Pol III-driven MIR transcriptome at single-locus resolution. While the majority of Pol III-transcribed MIR elements are cell-specific, we discovered a small set of ubiquitously transcribed MIRs mapping within Pol II-transcribed genes in antisense orientation that could influence the expression of the overlapping gene. We also identified novel Pol III-transcribed ncRNAs, deriving from transcription of annotated MIR fragments flanked by unique MIR-unrelated sequences, and confirmed the role of Pol III-specific internal promoter elements in MIR transcription. Besides demonstrating widespread transcription at these retrotranspositionally inactive elements in human cells, the ability to profile MIR expression at single-locus resolution will facilitate their study in different cell types and states including pathological alterations.
- Published
- 2016
9. CGGBP1 regulates CTCF occupancy at repeats
- Author
Subhamoy Datta, Manthan Patel, Divyesh Patel, and Umashankar Singh
- Subjects
CCCTC-Binding Factor ,lcsh:QH426-470 ,Interspersed repeat ,Cell Line ,Histones ,03 medical and health sciences ,0302 clinical medicine ,Genetics ,Humans ,RNA, Small Interfering ,Molecular Biology ,030304 developmental biology ,Cell Nucleus ,Regulation of gene expression ,0303 health sciences ,Binding Sites ,biology ,Research ,Chromatin ,Cell biology ,ChIP-sequencing ,DNA-Binding Proteins ,lcsh:Genetics ,Histone ,CTCF ,030220 oncology & carcinogenesis ,DNA methylation ,biology.protein ,H3K4me3 ,RNA Interference ,Protein Processing, Post-Translational
- Abstract
Background: CGGBP1 is a repeat-binding protein with diverse functions in the regulation of gene expression, cytosine methylation, repeat silencing and genomic integrity. CGGBP1 has also been identified as a cooperator of histone-modifying enzymes and as a component of CTCF-containing complexes that regulate enhancer–promoter looping. CGGBP1–CTCF cross talk in chromatin regulation has been hitherto unknown. Results: Here, we report that the occupancy of CTCF at repeats depends on CGGBP1. Using ChIP-sequencing for CTCF, we describe its occupancy at repetitive DNA. Our results show that the endogenous level of CGGBP1 ensures CTCF occupancy preferentially on repeats over canonical CTCF motifs. By combining CTCF ChIP-sequencing results with ChIP-sequencing for three different kinds of histone modifications (H3K4me3, H3K9me3 and H3K27me3), we show that the CGGBP1-dependent repeat-rich CTCF-binding sites regulate histone marks in flanking regions. Conclusion: CGGBP1 affects the pattern of CTCF occupancy. Our results posit CGGBP1 as a regulator of CTCF and its binding sites in interspersed repeats.
- Published
- 2019
10. Transcriptome Analysis of Recurrently Deregulated Genes across Multiple Cancers Identifies New Pan-Cancer Biomarkers
- Author
Hideya Kawaji, Masayoshi Itoh, Yuji Tanaka, Piero Carninci, Albin Sandelin, Alistair R. R. Forrest, Yoshihide Hayashizaki, Timo Lassmann, Bogumil Kaczkowski, and Robin Andersson
- Subjects
0301 basic medicine ,Regulation of gene expression ,Genetics ,Cancer Research ,Gene Expression Profiling ,Interspersed repeat ,Promoter ,Computational biology ,Biology ,Cap analysis gene expression ,Gene Expression Regulation, Neoplastic ,Gene expression profiling ,Transcriptome ,03 medical and health sciences ,030104 developmental biology ,Oncology ,Cell Line, Tumor ,Neoplasms ,Biomarkers, Tumor ,Humans ,Enhancer ,Gene
- Abstract
Genes that are commonly deregulated in cancer are clinically attractive as candidate pan-diagnostic markers and therapeutic targets. To globally identify such targets, we compared Cap Analysis of Gene Expression profiles from 225 different cancer cell lines and 339 corresponding primary cell samples to identify transcripts that are deregulated recurrently in a broad range of cancer types. Comparing RNA-seq data from 4,055 tumors and 563 normal tissues profiled in The Cancer Genome Atlas and FANTOM5 datasets, we identified a core transcript set with theranostic potential. Our analyses also revealed enhancer RNAs, which are upregulated in cancer, defining promoters that overlap with repetitive elements (especially SINE/Alu and LTR/ERV1 elements) that are often upregulated in cancer. Lastly, we documented for the first time upregulation of multiple copies of the REP522 interspersed repeat in cancer. Overall, our genome-wide expression profiling approach identified a comprehensive set of candidate biomarkers with pan-cancer potential, and extended the perspective and pathogenic significance of repetitive elements that are frequently activated during cancer progression. Cancer Res; 76(2); 216–26. ©2015 AACR.
- Published
- 2016
11. Characterization of contiguous gene deletions in COL4A6 and COL4A5 in Alport syndrome-diffuse leiomyomatosis
- Author
Shogo Minamikawa, China Nagano, Tomohiko Yamamura, Kandai Nozu, Motoko Yanagita, Koichi Nakanishi, Hiroshi Kaito, Takeshi Ninchoji, Eihiko Takahashi, Kazumoto Iijima, Yoshimitsu Gotoh, Naoya Morisada, Shuichiro Fujinaga, Ichiro Morioka, Takahiro Morishita, Masafumi Oka, Shiro Yamada, and Igor Vorechovsky
- Subjects
Collagen Type IV ,0301 basic medicine ,medicine.medical_specialty ,Interspersed repeat ,030232 urology & nephrology ,Nephritis, Hereditary ,Retrotransposon ,Biology ,03 medical and health sciences ,0302 clinical medicine ,Leiomyomatosis ,otorhinolaryngologic diseases ,Genetics ,medicine ,Humans ,Alport syndrome ,Genetics (clinical) ,Base Sequence ,Breakpoint ,Cytogenetics ,medicine.disease ,Molecular biology ,030104 developmental biology ,Human genome ,Homologous recombination ,Gene Deletion
- Abstract
Alport syndrome-diffuse leiomyomatosis (AS-DL, OMIM: 308940) is a rare variant of the X-linked Alport syndrome that shows overgrowth of visceral smooth muscles in the gastrointestinal, respiratory and female reproductive tracts in addition to renal symptoms. AS-DL results from deletions that encompass the 5′ ends of the COL4A5 and COL4A6 genes, but deletion breakpoints between COL4A5 and COL4A6 have been determined in only four cases. Here, we characterize deletion breakpoints in five AS-DL patients and show a contiguous COL4A6/COL4A5 deletion in each case. We also demonstrate that eight out of nine deletion alleles involved sequences homologous between COL4A5 and COL4A6. Most breakpoints took place in recognizable transposed elements, including long and short interspersed repeats, DNA transposons and long-terminal repeat retrotransposons. Because deletions involved the bidirectional promoter region in each case, we suggest that the occurrence of leiomyomatosis in AS-DL requires inactivation of both genes. Altogether, our study highlights the importance of homologous recombination involving multiple transposed elements for the development of this contiguous gene syndrome and other atypical loss-of-function phenotypes.
- Published
- 2017
12. Lineage tracing using a Cas9-deaminase barcoding system targeting endogenous L1 elements
- Author
Goo Jang, Yujin Jeon, Wookjae Lee, Namjin Cho, Duhee Bang, Byungjin Hwang, and Soo Young Yum
- Subjects
0301 basic medicine ,Genome evolution ,Science ,Interspersed repeat ,General Physics and Astronomy ,Mutagenesis (molecular biology technique) ,Genomics ,02 engineering and technology ,Computational biology ,Biology ,Cell fate determination ,medicine.disease_cause ,Time-Lapse Imaging ,General Biochemistry, Genetics and Molecular Biology ,Article ,03 medical and health sciences ,Genome editing ,CRISPR-Associated Protein 9 ,Cytidine Deaminase ,medicine ,DNA Barcoding, Taxonomic ,Humans ,Cell Lineage ,lcsh:Science ,Gene Editing ,Mutation ,Multidisciplinary ,Cas9 ,Cell Differentiation ,General Chemistry ,Cytidine deaminase ,021001 nanoscience & nanotechnology ,030104 developmental biology ,HEK293 Cells ,Long Interspersed Nucleotide Elements ,Mutagenesis ,lcsh:Q ,Single-Cell Analysis ,0210 nano-technology ,HeLa Cells ,RNA, Guide, Kinetoplastida
- Abstract
Determining cell lineage and function is critical to understanding human physiology and pathology. Although advances in lineage tracing methods provide new insight into cell fate, defining cellular diversity at the mammalian level remains a challenge. Here, we develop a genome editing strategy using a cytidine deaminase fused with nickase Cas9 (nCas9) to specifically target endogenous interspersed repeat regions in mammalian cells. The resulting mutation patterns serve as a genetic barcode, which is induced by targeted mutagenesis with single-guide RNA (sgRNA), leveraging substitution events, and subsequent read out by a single primer pair. By analyzing interspersed mutation signatures, we show the accurate reconstruction of cell lineage using both bulk cell and single-cell data. We envision that our genetic barcode system will enable fine-resolution mapping of organismal development in healthy and diseased mammalian states. Lineage tracing has provided new insights into cell fate but defining cellular diversity remains a challenge. Here the authors target endogenous repeat regions in mammalian cells with cytidine deaminase fused to nCas9 to create genetic barcodes for fine-resolution mapping.
- Published
- 2018
13. TranSurVeyor: an improved database-free algorithm for finding non-reference transpositions in high-throughput sequencing data
- Author
Ramesh Rajaby and Wing-Kin Sung
- Subjects
0301 basic medicine ,Databases, Factual ,Interspersed repeat ,Transposition (telecommunications) ,Biology ,computer.software_genre ,Genome ,DNA sequencing ,03 medical and health sciences ,Genetics ,Humans ,Cluster analysis ,Database ,Genome, Human ,Computational Biology ,High-Throughput Nucleotide Sequencing ,Reproducibility of Results ,Genomics ,Mutagenesis, Insertional ,030104 developmental biology ,Filter (video) ,DNA Transposable Elements ,Methods Online ,Human genome ,computer ,Algorithms ,Reference genome
- Abstract
Transpositions transfer DNA segments between different loci within a genome; in particular, when a transposition is found in a sample but not in a reference genome, it is called a non-reference transposition. They are important structural variations that have clinical impact. Transpositions can be called by analyzing second-generation high-throughput sequencing datasets. Current methods follow either a database-based or a database-free approach. Database-based methods require a database of transposable elements. Some of them have good specificity; however, this approach cannot detect novel transpositions, and it requires a good database of transposable elements, which is not yet available for many species. Database-free methods perform de novo calling of transpositions, but their accuracy is low. We observe that this is due to the misalignment of the reads; since reads are short and the human genome has many repeats, false alignments create false positive predictions while missing alignments reduce the true positive rate. This paper proposes new techniques to improve database-free non-reference transposition calling: first, we propose a realignment strategy called one-end remapping that corrects the alignments of reads in interspersed repeats; second, we propose an SNV-aware filter that removes some incorrectly aligned reads. By combining these two techniques and other techniques like clustering and positive-to-negative ratio filter, our proposed transposition caller TranSurVeyor shows at least 3.1-fold improvement in terms of F1-score over existing database-free methods. More importantly, even though TranSurVeyor does not use databases of prior information, its performance is at least as good as existing database-based methods such as MELT, Mobster and Retroseq. We also illustrate that TranSurVeyor can discover transpositions that are not known in the current database.
- Published
- 2018
14. Association of in vitro fertilization with global and IGF2/H19 methylation variation in newborn twins
- Author
Richard Saffery, Yuk Jing Loke, John C. Galati, and Jeffrey M. Craig
- Subjects
Adult ,Male ,Offspring ,medicine.medical_treatment ,Interspersed repeat ,Twins ,Medicine (miscellaneous) ,Fertilization in Vitro ,Biology ,Intracytoplasmic sperm injection ,Epigenesis, Genetic ,Insulin-Like Growth Factor II ,Pregnancy ,medicine ,Humans ,Epigenetics ,reproductive and urinary physiology ,Genetics ,In vitro fertilisation ,Infant, Newborn ,Methylation ,DNA Methylation ,Twin study ,female genital diseases and pregnancy complications ,embryonic structures ,DNA methylation ,Female ,RNA, Long Noncoding
- Abstract
In vitro fertilization (IVF) and its subset, intracytoplasmic sperm injection (ICSI), are widely used medical treatments for conception. There has been controversy over whether IVF is associated with adverse short- and long-term health outcomes of offspring. As with other prenatal factors, epigenetic change is thought to be a molecular mediator of any in utero programming effects. Most studies focused on DNA methylation at gene-specific and genomic level, with only a few on associations between DNA methylation and IVF. Using buccal epithelium from 208 twin pairs from the Peri/Postnatal Epigenetic Twin Study (PETS), we investigated associations between IVF and DNA methylation on a global level, using the proxies of Alu and LINE-1 interspersed repeats in addition to two locus-specific regulatory regions within IGF2/H19, controlling for 13 potentially confounding factors. After correction for multiple testing, we found strong evidence that IVF-conceived twins have lower DNA methylation in Alu, and weak evidence of lower methylation in one of the two IGF2/H19 regulatory regions and LINE-1, compared with naturally conceived twins. Weak evidence of a relationship between ICSI and DNA methylation within IGF2/H19 regulatory region was found, suggesting that one or more of the processes associated with IVF/ICSI may contribute to these methylation differences. Lower within- and between-pair DNA methylation variation was also found in IVF-conceived twins for LINE-1, Alu and one IGF2/H19 regulatory region. Although larger sample sizes are needed, our results provide additional insight into the possible influence of IVF and ICSI on DNA methylation. To our knowledge, this is the largest study to date investigating the association of IVF and DNA methylation.
- Published
- 2015
15. Precise mapping of 17 deletion breakpoints within the central hotspot deletion region (introns 50 and 51) of the DMD gene
- Author
Igor Cm Tandurella, Gabriella Esposito, Maria Savarese, Antonella Carsana, Maria Roberta Tremolaterra, Tiziana Fioretti, and Evelina Marsocci
- Subjects
musculoskeletal diseases ,0301 basic medicine ,Interspersed repeat ,Chromosome Breakpoints ,Biology ,Genome ,Homology (biology) ,Dystrophin ,03 medical and health sciences ,Exon ,0302 clinical medicine ,deletion, breakpoints, DMD gene ,Genetics ,Humans ,Deletion mapping ,Genetics (clinical) ,Sequence Deletion ,Breakpoint ,Intron ,Chromosome Mapping ,DNA ,Exons ,Introns ,Muscular Dystrophy, Duchenne ,030104 developmental biology ,Phenotype ,030217 neurology & neurosurgery
- Abstract
Exon deletions in the human DMD gene, which encodes the dystrophin protein, are the molecular defect in 50-70% of cases of Duchenne/Becker muscular dystrophies. Deletions are preferentially clustered in the 5' (exons 2-20) and the central (exons 45-53) region of DMD, likely because local DNA structure predisposes to specific breakage or recombination events. Notably, innovative therapeutic strategies may rescue dystrophin function by homology-based specific targeting of sequences within the central DMD hotspot deletion region. To further study molecular mechanisms that generate such frequent genome variations and to identify residual intronic sequences, we sequenced 17 deletion breakpoints within introns 50 and 51 of DMD and analyzed the surrounding genomic architecture. There was no breakpoint clustering within the introns nor extensive homology between sequences adjacent to each junction. However, at or near the breakpoint, we found microhomology, short tandem repeats, interspersed repeat elements and short sequence stretches that predispose to DNA deletion or bending. Identification of such structural elements helps elucidate general mechanisms generating deletion within the DMD gene. Moreover, precise mapping of deletion breakpoints and localization of repeated elements are of interest, because residual intronic sequences may be targeted by therapeutic strategies based on genome editing correction. Journal of Human Genetics advance online publication, 7 September 2017; doi:10.1038/jhg.2017.84.
- Published
- 2017
16. Human transposon insertion profiling: Analysis, visualization and identification of somatic LINE-1 insertions in ovarian cancer
- Author
-
Jef D. Boeke, Nemanja Rodić, Cheng Ran Lisa Huang, Zuojian Tang, Mark Grivainis, Ie Ming Shih, David Fenyö, Kathleen H. Burns, Sisi Ma, Tian Li Wang, and Jared P. Steranka
- Subjects
0301 basic medicine ,Transposable element ,Genetics ,Ovarian Neoplasms ,Multidisciplinary ,Somatic cell ,Genome, Human ,Interspersed repeat ,Retrotransposon ,Biology ,Genome ,Polymerase Chain Reaction ,Structural variation ,Machine Learning ,03 medical and health sciences ,030104 developmental biology ,Germline mutation ,Long Interspersed Nucleotide Elements ,PNAS Plus ,Humans ,Human genome ,Female ,Algorithms - Abstract
Mammalian genomes are replete with interspersed repeats reflecting the activity of transposable elements. These mobile DNAs are self-propagating, and their continued transposition is a source of both heritable structural variation as well as somatic mutation in human genomes. Tailored approaches to map these sequences are useful to identify insertion alleles. Here, we describe in detail a strategy to amplify and sequence long interspersed element-1 (LINE-1, L1) retrotransposon insertions selectively in the human genome, transposon insertion profiling by next-generation sequencing (TIPseq). We also report the development of a machine-learning-based computational pipeline, TIPseqHunter, to identify insertion sites with high precision and reliability. We demonstrate the utility of this approach to detect somatic retrotransposition events in high-grade ovarian serous carcinoma.
- Published
- 2017
17. Xp22.31 Microdeletion due to Microhomology-Mediated Break-Induced Replication in a Boy with Contiguous Gene Deletion Syndrome
- Author
-
Akira Ishiguro, Miki Kamimura, Satoshi Narumi, Maki Fukami, Hirohito Shima, Ikuma Fujiwara, Shigeo Kure, Erina Suzuki, Koki Nagai, and Junko Kanno
- Subjects
0301 basic medicine ,DNA Replication ,Male ,DNA End-Joining Repair ,Interspersed repeat ,Non-allelic homologous recombination ,Alu element ,Nerve Tissue Proteins ,Biology ,03 medical and health sciences ,Genetics ,medicine ,Humans ,DNA Breaks, Double-Stranded ,Copy-number variation ,Homologous Recombination ,Molecular Biology ,Genetics (clinical) ,X chromosome ,Chromosomes, Human, X ,Comparative Genomic Hybridization ,Extracellular Matrix Proteins ,Base Sequence ,Ichthyosis ,Breakpoint ,Infant ,Syndrome ,medicine.disease ,Molecular biology ,030104 developmental biology ,Steryl-Sulfatase ,Chromosome Deletion ,GC-content ,Gene Deletion - Abstract
The Xp22.31 region is characterized by a low frequency of interspersed repeats and a low GC content. Submicroscopic deletions at Xp22.31 involving STS and ANOS1 (alias KAL1) underlie X-linked ichthyosis and Kallmann syndrome, respectively. Of the known microdeletions at Xp22.31, a common approximately 1.5-Mb deletion encompassing STS was ascribed to nonallelic homologous recombination, while 2 ANOS1-containing deletions were attributed to nonhomologous end-joining. However, the genomic bases of other microdeletions within the Xp22.31 region remain to be elucidated. Here, we identified a 2,735,696-bp deletion encompassing STS and ANOS1 in a boy with X-linked ichthyosis and Kallmann syndrome. The breakpoints of the deletion were located within Alu repeats and shared 2-bp microhomology. The fusion junction was not associated with nucleotide stretches, and the breakpoint-flanking regions harbored no palindromes or noncanonical DNA motifs. These results indicate that microhomology-mediated break-induced replication (MMBIR) can cause deletions at Xp22.31, resulting in contiguous gene deletion syndrome. It appears that interspersed repeats without other known rearrangement-inducing DNA features or high GC contents are sufficient to stimulate MMBIR at Xp22.31.
- Published
- 2016
18. Repression of Stress-Induced LINE-1 Expression Protects Cancer Cell Subpopulations from Lethal Drug Exposure
- Author
-
Marie Classon, Gulfem D. Guler, Matthew Wongchenko, Joshua D. Webster, Erica L. Jackson, David Arnott, Tracy Leah Nance, Catherine Wilson, Tommy K. Cheung, Jinfeng Liu, Navneet Alag, Suchit Jhunjhunwala, Benjamin Haley, Ganapati V. Hegde, Katrina Nichols, Yibing Yan, Kuan-Bei Chen, Charles Tindell, Trinna L. Cuellar, Paul G. Giresi, Hyojin Kim, Jean-Philippe Stephan, Jeff Settleman, and Robert M. Pitti
- Subjects
0301 basic medicine ,Cancer Research ,Cell ,Population ,Interspersed repeat ,Mice, Nude ,Antineoplastic Agents ,Mice, SCID ,Biology ,Epigenetic Repression ,Methylation ,Genomic Instability ,Histones ,03 medical and health sciences ,Histone H3 ,Mice ,Stress, Physiological ,Neoplasms ,medicine ,Tumor Cells, Cultured ,Animals ,Humans ,education ,Psychological repression ,education.field_of_study ,Cell Biology ,Histone-Lysine N-Methyltransferase ,Xenograft Model Antitumor Assays ,Chromatin ,Gene Expression Regulation, Neoplastic ,030104 developmental biology ,medicine.anatomical_structure ,Long Interspersed Nucleotide Elements ,Oncology ,Drug Resistance, Neoplasm ,Cancer cell ,Cancer research - Abstract
Maintenance of phenotypic heterogeneity within cell populations is an evolutionarily conserved mechanism that underlies population survival upon stressful exposures. We show that the genomes of a cancer cell subpopulation that survives treatment with otherwise lethal drugs, the drug-tolerant persisters (DTPs), exhibit a repressed chromatin state characterized by increased methylation of histone H3 lysines 9 and 27 (H3K9 and H3K27). We also show that survival of DTPs is, in part, maintained by regulators of H3K9me3-mediated heterochromatin formation and that the observed increase in H3K9me3 in DTPs is most prominent over long interspersed repeat element 1 (LINE-1). Disruption of the repressive chromatin over LINE-1 elements in DTPs results in DTP ablation, which is partially rescued by reducing LINE-1 expression or function.
- Published
- 2016
19. (AT)n is an interspersed repeat in the Xenopus genome
- Author
-
R K Patient and David R. Greaves
- Subjects
Interspersed repeat ,Xenopus ,Genome ,General Biochemistry, Genetics and Molecular Biology ,Histones ,Xenopus laevis ,chemistry.chemical_compound ,Animals ,Humans ,Globin ,Molecular Biology ,Gene ,Repetitive Sequences, Nucleic Acid ,Genetics ,Base Sequence ,General Immunology and Microbiology ,biology ,General Neuroscience ,biology.organism_classification ,Molecular biology ,Globins ,Histone ,chemistry ,biology.protein ,Ploidy ,DNA ,Research Article - Abstract
We have observed (AT)34 and (AT)23 tracts close to the coding sequences of the Xenopus laevis tadpole alpha T1 and adult beta 1 globin genes, respectively. We show that (AT)n sequences are found as interspersed repeats within the Xenopus globin and histone gene loci. Using (AT)n co-polymer in filter hybridisation experiments we estimate that there are 10^4 (AT)n tracts per haploid Xenopus genome. Hybridisation to genomic blots of DNA from yeast, slime mold, trypanosome, fruit fly, salmon, chicken, rat, human, crab and Xenopus species shows that strictly alternating AT of sufficient length to hybridise appears to be most abundant in Xenopus and crab genomes. We show that the specificity of the co-polymer probe for strictly alternating AT is, however, dependent on the length of the probe. Hybridisation experiments using (TG)n copolymer suggest that this highly conserved repeat is found as clustered repeats in the Xenopus genome in contrast to other eukaryotic genomes so far studied.
- Published
- 2016
20. Retrotransposons as regulators of gene expression
- Author
-
Lynne E. Maquat, Bronwyn A. Lucas, and Reyad A. Elbarbary
- Subjects
0301 basic medicine ,Transposable element ,Genome evolution ,Transcription, Genetic ,RNA Stability ,Interspersed repeat ,Retrotransposon ,Computational biology ,Biology ,Genome ,behavioral disciplines and activities ,Article ,Evolution, Molecular ,03 medical and health sciences ,Mice ,RNA Precursors ,Short Interspersed Nucleotide Elements ,Animals ,Humans ,Disease ,RNA, Messenger ,RNA Processing, Post-Transcriptional ,Genetics ,Regulation of gene expression ,Multidisciplinary ,Chromatin ,Long interspersed nuclear element ,030104 developmental biology ,Long Interspersed Nucleotide Elements ,Gene Expression Regulation ,Protein Biosynthesis ,psychological phenomena and processes - Abstract
Parasitic DNAs help and hinder evolution. Transposable elements are parasitic DNAs that can duplicate themselves and jump around their host genomes. They can both disrupt gene function and drive genome evolution. Elbarbary et al. review the roles of two classes of transposable elements in gene regulation and disease: long interspersed elements (LINEs) and short interspersed elements (SINEs). Roughly a third of the human genome consists of LINEs and SINEs. They contribute to a broad range of important genome and gene regulatory features, while at the same time being responsible for a number of human diseases. Science, this issue p. 10.1126/science.aac7247
- Published
- 2016
21. Active human retrotransposons: variation and disease
- Author
-
Haig H. Kazazian and Dustin C. Hancks
- Subjects
Time Factors ,Transcription, Genetic ,Pseudogene ,Interspersed repeat ,Alu element ,Retrotransposon ,Biology ,Article ,Open Reading Frames ,Alu Elements ,Untranslated Regions ,Genetics ,Humans ,Promoter Regions, Genetic ,Transposons as a genetic tool ,Genome, Human ,Genetic Diseases, Inborn ,Genetic Variation ,DNA Methylation ,Noncoding DNA ,Long interspersed nuclear element ,Long Interspersed Nucleotide Elements ,DNA Transposable Elements ,RNA ,Human genome ,Pseudogenes ,Developmental Biology - Abstract
Mobile DNAs, also known as transposons or 'jumping genes', are widespread in nature and comprise an estimated 45% of the human genome. Transposons are divided into two general classes based on their transposition intermediate (DNA or RNA). Only one subclass, the non-LTR retrotransposons, which includes the Long INterspersed Element-1 (LINE-1 or L1), is currently active in humans as indicated by 96 disease-causing insertions. The autonomous LINE-1 is capable of retrotransposing not only a copy of its own RNA in cis but also other RNAs (Alu, SINE-VNTR-Alu (SVA), U6) in trans to new genomic locations through an element-encoded reverse transcriptase. L1 can also retrotranspose cellular mRNAs, resulting in processed pseudogene formation. Here, we highlight recent reports that update our understanding of human L1 retrotransposition and their role in disease. Finally, we discuss studies that provide insights into the past and current activity of these retrotransposons, and shed light on not just when, but where, retrotransposition occurs and its part in genetic variation.
- Published
- 2012
22. Genomic relationship between SINE retrotransposons, Pol III–Pol II transcription, and chromatin organization: the journey from junk to jewel
- Author
-
Michelle Atallah and Victoria V. Lunyak
- Subjects
Genetics ,Transposable element ,Genome ,Retroelements ,Interspersed repeat ,Retrotransposon ,DNA Polymerase II ,Cell Biology ,Biology ,Biochemistry ,Article ,Chromatin ,Long terminal repeat ,RNA polymerase III ,Long interspersed nuclear element ,Mice ,Animals ,Humans ,Short Interspersed Nucleotide Elements ,Molecular Biology ,DNA Polymerase III - Abstract
A typical eukaryotic genome harbors a rich variety of repetitive elements. The most abundant are retrotransposons, mobile retroelements that utilize reverse transcriptase and an RNA intermediate to move to a new location within the cellular genome. A vast majority of the repetitive mammalian genome content has originated from the retrotransposition of SINE (100–300 bp short interspersed nuclear elements that are derived from the structural 7SL RNA or tRNA), LINE (7 kb long interspersed nuclear element), and LTR (2–3 kb long terminal repeats) transposable element superfamilies. Broadly labeled as “evolutionary junkyard” or “fossils”, this enigmatic “dark matter” of the genome possesses many yet to be discovered properties.
- Published
- 2011
23. LINE-1 Elements in Structural Variation and Disease
- Author
-
Richard M. Badge, Christine R. Beck, Jose L. Garcia-Perez, and John V. Moran
- Subjects
Genetics ,Genome evolution ,Retroelements ,Genome, Human ,Interspersed repeat ,Genetic Variation ,Alu element ,Retrotransposon ,Bacterial genome size ,Biology ,Genome ,Article ,Long Interspersed Nucleotide Elements ,Evolutionary biology ,Humans ,Disease ,Human genome ,Molecular Biology ,Genetics (clinical) ,Reference genome - Abstract
The completion of the human genome reference sequence ushered in a new era for the study and discovery of human transposable elements. It now is undeniable that transposable elements, historically dismissed as junk DNA, have had an instrumental role in sculpting the structure and function of our genomes. In particular, long interspersed element-1 (LINE-1 or L1) and short interspersed elements (SINEs) continue to affect our genome, and their movement can lead to sporadic cases of disease. Here, we briefly review the types of transposable elements present in the human genome and their mechanisms of mobility. We next highlight how advances in DNA sequencing and genomic technologies have enabled the discovery of novel retrotransposons in individual genomes. Finally, we discuss how L1-mediated retrotransposition events impact human genomes.
- Published
- 2011
24. Parallelism in Evolution of Highly Repetitive DNAs in Sibling Species
- Author
-
Brankica Mravinac and Miroslav Plohl
- Subjects
Male ,Heterochromatin ,Satellite DNA ,Molecular Sequence Data ,Interspersed repeat ,Genes, Insect ,Genomics ,satellite DNA ,interspersed repeats ,sibling species ,Tribolium ,heterochromatin ,evolution ,Biology ,Genome ,Genetics ,Animals ,Humans ,Repeated sequence ,Molecular Biology ,In Situ Hybridization, Fluorescence ,Phylogeny ,Ecology, Evolution, Behavior and Systematics ,Repetitive Sequences, Nucleic Acid ,Sequence (medicine) ,Base Sequence ,Nucleic acid sequence ,DNA ,Sequence Analysis, DNA ,Multigene Family ,Sequence Alignment - Abstract
Characterization of heterochromatin in the flour beetle Tribolium audax revealed two highly repetitive DNA families, named TAUD1 and TAUD2, which together constitute almost 60% of the whole genome. Both families originated from a common ancestral approximately 110-bp repeating unit. Tandem arrangement of these elements in TAUD1 is typical for satellite DNAs, whereas TAUD2 represents a dispersed family based on 1412-bp complex higher-order repeats composed of inversely oriented approximately 110 bp units. Comparison with repetitive DNAs in the sibling species Tribolium madens showed similarities in nucleotide sequence and length of basic repeating units and also revealed structural and organizational parallelism in tandem and dispersed families assembled from these elements. In both Tribolium species, one tandem and one dispersed family build equivalent distribution patterns in the pericentromeric heterochromatin of all chromosomes including supernumeraries. Differences in the nucleotide sequence and in the complexity of higher-order structures between families of the same type suggest a scenario according to which rearranged variants of the corresponding ancestral families were formed and distributed in genomes during or after the speciation event, following the same principles independently in each descendant species. We assume that random effects of sequence dynamics should be constrained by organizational and structural features of repeating units and possible requirements for spatial distribution of particular sequence elements. An interspersed pattern of repetitive families also points to the intensive recombination events in heterochromatin. Synergy between the meiotic bouquet stage and satellite DNA sequence dynamics could make a positive feedback loop that promotes the observed genome-wide distribution. 
At the same time, considering the abundance of these DNAs in heterochromatin spanning the (peri)centromeric chromosomal segments, we speculate that diverged repetitive sequences might represent the DNA basis of reproductive barrier between the two sibling species.
- Published
- 2010
25. Retrotransposons and non-protein coding RNAs
- Author
-
Tobias Mourier and Eske Willerslev
- Subjects
Transposable element ,Genetics ,RNA, Untranslated ,Retroelements ,Transcription, Genetic ,fungi ,genetic processes ,Interspersed repeat ,information science ,food and beverages ,RNA ,Retrotransposon ,Computational biology ,Biology ,Biochemistry ,Genome ,Long terminal repeat ,Long non-coding RNA ,Gene Expression Regulation ,microRNA ,health occupations ,Animals ,Humans ,Molecular Biology - Abstract
Retrotransposons constitute a significant fraction of mammalian genomes. Considering the finding of widespread transcriptional activity across entire genomes, it is not surprising that retrotransposons contribute to the collective RNA pool. However, the transcriptional output from retrotransposons does not merely represent spurious transcription. We review examples of functional RNAs transcribed from retrotransposons, and address the collection of non-protein coding RNAs derived from transposable element sequences, including numerous human microRNAs and the neuronal BC RNAs. Finally, we review the emerging understanding of how retrotransposons themselves are regulated by small RNAs.
- Published
- 2009
26. Changes in DNA methylation of tandem DNA repeats are different from interspersed repeats in cancer
- Author
-
Dan Douer, Scott Worswick, Talia Shear, Erika M. Wolff, Si Ho Choi, Gangning Liang, Guillermo Garcia-Manero, Allen S. Yang, Hyang-Min Byun, and John C. Soussa
- Subjects
Male ,Cancer Research ,Interspersed repeat ,Alu element ,Biology ,Polymerase Chain Reaction ,Article ,Epigenetics of physical exercise ,Leukemia, Promyelocytic, Acute ,Tandem repeat ,Alu Elements ,Leukemia, Myelogenous, Chronic, BCR-ABL Positive ,Neoplasms ,Humans ,Cancer epigenetics ,Epigenetics ,Short Interspersed Nucleotide Elements ,Genetics ,Sequence Analysis, DNA ,DNA Methylation ,Middle Aged ,Molecular biology ,Urinary Bladder Neoplasms ,Oncology ,Tandem Repeat Sequences ,DNA methylation ,Female ,Human genome - Abstract
Hypomethylation of DNA repetitive elements is a common finding in cancer, but very little is known about the DNA methylation changes of different types of DNA repetitive elements, such as interspersed repeats (LINE1 and Alu Yb8) and tandem repeats (Sat-α, NBL-2 and D4Z4). We used bisulfite-PCR Pyrosequencing to quantitatively measure the DNA methylation of 5 different DNA repetitive elements in normal tissue and cancer. In all we studied 10 different tissues from 4 individuals undergoing autopsy, 34 paired normal and tumor tissues from patients with bladder cancer, 58 patients with chronic myelogenous leukemia and 23 patients with acute promyelocytic leukemia. We found that the DNA methylation of interspersed repeats (LINE1 and Alu Yb8) was very consistent from person to person and tissue to tissue while tandem DNA repeats appeared more variable in normal tissues. In bladder cancer we found clear hypomethylation of LINE1, Alu Yb8, Sat-α, and NBL-2. Conversely, we found an increase in the DNA methylation levels of D4Z4 from normal to cancer. In contrast leukemia showed no significant changes in the DNA methylation of LINE1 and Alu Yb8, but DNA methylation increases in NBL-2 and D4Z4 tandem repeats. Our findings show that the changes in DNA methylation levels of individual DNA repetitive elements are unique for each repetitive element, which may reflect distinct epigenetic factors and may have important implications in the use of DNA methylation of repetitive elements as global DNA methylation biomarkers. Keywords: DNA methylation, DNA repetitive elements, bladder cancer, leukemia
- Published
- 2009
27. Tracking the past: Interspersed repeats in an extinct Afrotherian mammal, Mammuthus primigenius
- Author
-
Stephan C. Schuster, Fangqing Zhao, and Ji Qi
- Subjects
Time Factors ,Letter ,Retroelements ,Woolly mammoth ,Lineage (evolution) ,Interspersed repeat ,Zoology ,Retrotransposon ,Extinction, Biological ,Genome ,Evolution, Molecular ,Genetics ,Animals ,Humans ,Computer Simulation ,Genome size ,Phylogeny ,Genetics (clinical) ,Mammoth ,Mammals ,Models, Genetic ,biology ,Chromosome Mapping ,Paleontology ,Genomics ,Opossums ,Interspersed Repetitive Sequences ,biology.organism_classification ,Chromosomes, Mammalian ,DNA Transposable Elements - Abstract
The woolly mammoth (Mammuthus primigenius) died out several thousand years ago, yet recent paleogenomic studies have successfully recovered genetic information from both the mitochondrial and nuclear genomes of this extinct species. Mammoths belong to Afrotheria, a group of mammals exhibiting extreme morphological diversity and large genome sizes. In this study, we found that the mammoth genome contains a larger proportion of interspersed repeats than any other mammalian genome reported so far, in which the proliferation of the RTE family of retrotransposons (covering 12% of the genome) may be the main reason for an increased genome size. Phylogenetic analysis showed that RTEs in mammoth are closely related to the family BovB/RTE. The incongruence of the reconstructed RTE phylogeny indicates that RTEs in mammoth may have been acquired through an ancient lateral gene transfer event. A recent proliferation of SINEs was also found in the proboscidean lineage, whereas the Afrotherian-wide SINEs in mammoth have undergone a rather flat and stepwise expansion. Comparisons of the transposable elements (TEs) between mammoth and other mammals may shed light on the evolutionary history of TEs in various mammalian lineages.
- Published
- 2009
28. Characterization of rabbit DNA microsatellites extracted from the EMBL nucleotide sequence database
- Author
-
L.F.M. van Zutphen and H.A. van Lith
- Subjects
Databases, Factual ,Swine ,Interspersed repeat ,Biology ,Genome ,DNA sequencing ,Species Specificity ,Trinucleotide Repeats ,Genetics ,Animals ,Humans ,Coding region ,Dinucleotide Repeats ,Gene ,Polymorphism, Genetic ,Base Sequence ,Nucleic acid sequence ,DNA ,General Medicine ,Molecular biology ,Rats ,Genetic marker ,Microsatellite ,Animal Science and Zoology ,Rabbits ,Chickens ,Microsatellite Repeats - Abstract
Microsatellite polymorphisms are invaluable for mapping vertebrate genomes. In order to estimate the occurrence of microsatellites in the rabbit genome and to assess their feasibility as markers in rabbit genetics, a survey on the presence of all types of mononucleotide, dinucleotide, trinucleotide and tetranucleotide repeats, with a length of about 20 bp or more, was conducted by searching the published rabbit DNA sequences in the EMBL nucleotide database (version 323). A total of 181 rabbit microsatellites could be extracted from the present database. The estimated frequency of microsatellites in the rabbit genome was one microsatellite for every 2-3 kb of DNA. Dinucleotide repeats constituted the prevailing class of microsatellites, followed by trinucleotide, mononucleotide and tetranucleotide repeats, respectively. The average length of the microsatellites, as found in the database, was 26, 23, 23 and 22 bp for mono-, di-, tri- and tetranucleotide repeats, respectively. The most common repeat motif was AG, followed by A, AC, AGG and CCG. This group comprised about 70% of all extracted rabbit microsatellites. About 61% of the microsatellites were found in non-coding regions of genes, whereas 15% resided in (protein) coding regions. A significant fraction of rabbit microsatellites (about 22%) was found within interspersed repetitive DNA sequences.
- Published
- 2009
29. Large-scale analysis of exonized mammalian-wide interspersed repeats in primate genomes
- Author
-
Yi Xing, Shihao Shen, Peng Jiang, Beverly L. Davidson, Lan Lin, and Seiko Sato
- Subjects
Primates ,RNA Splicing ,Interspersed repeat ,Alu element ,Retrotransposon ,Biology ,Evolution, Molecular ,Exon ,Genetics ,Animals ,Humans ,Molecular Biology ,Genetics (clinical) ,Mammals ,Genome ,food and beverages ,Exons ,Articles ,General Medicine ,Interspersed Repetitive Sequences ,Evolutionary biology ,RNA splicing ,DNA Transposable Elements ,Human genome ,Tandem exon duplication - Abstract
Transposable elements (TEs) are major sources of new exons in higher eukaryotes. Almost half of the human genome is derived from TEs, and many types of TEs have the potential to exonize. In this work, we conducted a large-scale analysis of human exons derived from mammalian-wide interspersed repeats (MIRs), a class of old TEs which was active prior to the radiation of placental mammals. Using exon array data of 328 MIR-derived exons and RT–PCR analysis of 39 exons in 10 tissues, we identified 15 constitutively spliced MIR exons, and 15 MIR exons with tissue-specific shift in splicing patterns. Analysis of RNAs from multiple species suggests that the splicing events of many strongly included MIR exons have been established before the divergence of primates and rodents, while a small percentage result from recent exonization during primate evolution. Interestingly, exon array data suggest substantially higher splicing activities of MIR exons when compared with exons derived from Alu elements, a class of primate-specific retrotransposons. This appears to be a universal difference between exons derived from young and old TEs, as it is also observed when comparing Alu exons to exons derived from LINE1 and LINE2, two other groups of old TEs. Together, this study significantly expands current knowledge about exonization of TEs. Our data imply that with sufficient evolutionary time, numerous new exons could evolve beyond the evolutionary intermediate state and contribute functional novelties to modern mammalian genomes.
- Published
- 2009
30. Microhomologies and interspersed repeat elements at genomic breakpoints in chronic myeloid leukemia
- Author
-
Achille Venco, Alessandro Rambaldi, Leonardo Campiotti, E Mattarucchi, Vittoria Guerini, Giovanni Porta, Francesco Lo Curto, and Francesco Pasquali
- Subjects
Cancer Research ,Sequence analysis ,Chromosomes, Human, Pair 22 ,Molecular Sequence Data ,Interspersed repeat ,Fusion Proteins, bcr-abl ,Alu element ,Chromosomal translocation ,Biology ,Translocation, Genetic ,Cohort Studies ,chemistry.chemical_compound ,Leukemia, Myelogenous, Chronic, BCR-ABL Positive ,Sequence Homology, Nucleic Acid ,Genetics ,Humans ,Recombination, Genetic ,Base Sequence ,Breakpoint ,Myeloid leukemia ,Chromosome Breakage ,Interspersed Repetitive Sequences ,Non-homologous end joining ,chemistry ,Chromosomes, Human, Pair 9 ,DNA - Abstract
Reciprocal translocation t(9;22) is central to the pathogenesis of chronic myeloid leukemia. Some authors have suggested that Alu repeats facilitate this process, but supporting analyses have been sparse and often anecdotal. The purpose of this study was to analyze the local structure of t(9;22) translocations and assess the relevance of interspersed repeat elements at breakpoints. Collected data have been further compared with the current models of DNA recombination, in particular the single-strand annealing (SSA) and the nonhomologous end joining (NHEJ) processes. We developed a protocol for the rapid characterization of patient-specific genomic junctions and analyzed 27 patients diagnosed with chronic myeloid leukemia. Sequence analysis revealed microhomologies at the junctions of 21 of 27 patients, while interspersed repeats were of relevance (P < 0.05) in at least 16 patients. These findings are more frequent than expected and give an indication that the main mechanisms involved in the t(9;22) translocation are the SSA and NHEJ pathways, both playing a role. Furthermore, our report is consistent with microhomologies facilitating the joining of DNA ends in the translocation process, and with both Alu and a variety of other repeat sequences pairing nonhomologous chromosomes during the SSA pathway. © 2008 Wiley-Liss, Inc.
- Published
- 2008
31. Genome-wide tracking of unmethylated DNA Alu repeats in normal and cancer cells
- Author
-
Cristina Bernadó Morales, Mar Muñoz, Miguel A. Peinado, Jairo Rodriguez, Mireia Jordà, Elisenda Vendrell, and Laura Vives
- Subjects
Colon ,Interspersed repeat ,Alu element ,Biology ,Polymerase Chain Reaction ,Genome ,Epigenesis, Genetic ,Alu Elements ,Cell Line, Tumor ,Genetics ,Chromosomes, Human ,Humans ,Intestinal Mucosa ,Càncer ,Gene ,Cancer ,Genomic organization ,Genome, Human ,Carcinoma ,Computational Biology ,Genomics ,DNA Methylation ,Genòmica ,CpG site ,DNA methylation ,CpG Islands ,Human genome ,Colorectal Neoplasms - Abstract
Methylation of cytosine is the most frequent epigenetic modification of DNA in mammalian cells. In humans, most of the methylated cytosines are found in CpG-rich sequences within tandem and interspersed repeats that make up to 45% of the human genome, with Alu repeats being the most common family. Demethylation of Alu elements occurs in aging and cancer processes and has been associated with gene reactivation and genomic instability. By targeting the unmethylated SmaI site within the Alu sequence as a surrogate marker, we have quantified and identified unmethylated Alu elements on the genomic scale. Normal colon epithelial cells contain on average 25 486 ± 10 157 unmethylated Alu's per haploid genome, while in tumor cells this figure is 41 995 ± 17 187 (P = 0.004). There is an inverse relationship in Alu families with respect to their age and methylation status: the youngest elements exhibit the highest prevalence of the SmaI site (AluY: 42%; AluS: 18%, AluJ: 5%) but the lowest rates of unmethylation (AluY: 1.65%; AluS: 3.1%, AluJ: 12%). Data are consistent with a stronger silencing pressure on the youngest repetitive elements, which are closer to genes. Further insights into the functional implications of atypical unmethylation states in Alu elements will surely contribute to deciphering genomic organization and gene regulation in complex organisms.
- Published
- 2007
32. The Influence of LINE-1 and SINE Retrotransposons on Mammalian Genomes
- Author
-
Jose L. Garcia-Perez, John B. Moldovan, Aurélien J. Doucet, Sandra R. Richardson, Huira C. Kopera, and John V. Moran
- Subjects
Microbiology (medical) ,Physiology ,Interspersed repeat ,Alu element ,Retrotransposon ,Biology ,Genome ,Article ,03 medical and health sciences ,0302 clinical medicine ,Genetics ,Short Interspersed Nucleotide Elements ,Animals ,Humans ,030304 developmental biology ,Mammals ,Recombination, Genetic ,0303 health sciences ,General Immunology and Microbiology ,Ecology ,Genetic Diseases, Inborn ,Genetic Variation ,Cell Biology ,Long terminal repeat ,Long interspersed nuclear element ,Infectious Diseases ,Long Interspersed Nucleotide Elements ,Evolutionary biology ,Human genome ,030217 neurology & neurosurgery - Abstract
Transposable elements have had a profound impact on the structure and function of mammalian genomes. The retrotransposon Long INterspersed Element-1 (LINE-1 or L1), by virtue of its replicative mobilization mechanism, comprises ∼17% of the human genome. Although the vast majority of human LINE-1 sequences are inactive molecular fossils, an estimated 80–100 copies per individual retain the ability to mobilize by a process termed retrotransposition. Indeed, LINE-1 is the only active, autonomous retrotransposon in humans and its retrotransposition continues to generate both intra-individual and inter-individual genetic diversity. Here, we briefly review the types of transposable elements that reside in mammalian genomes. We will focus our discussion on LINE-1 retrotransposons and the non-autonomous Short INterspersed Elements (SINEs) that rely on the proteins encoded by LINE-1 for their mobilization. We review cases where LINE-1-mediated retrotransposition events have resulted in genetic disease and discuss how the characterization of these mutagenic insertions led to the identification of retrotransposition-competent LINE-1s in the human and mouse genomes. We then discuss how the integration of molecular genetic, biochemical, and modern genomic technologies have yielded insight into the mechanism of LINE-1 retrotransposition, the impact of LINE-1-mediated retrotransposition events on mammalian genomes, and the host cellular mechanisms that protect the genome from unabated LINE-1-mediated retrotransposition events. Throughout this review, we highlight unanswered questions in LINE-1 biology that provide exciting opportunities for future research. Clearly, much has been learned about LINE-1 and SINE biology since the publication of Mobile DNA II thirteen years ago. Future studies should continue to yield exciting discoveries about how these retrotransposons contribute to genetic diversity in mammalian genomes.
- Published
- 2015
33. CGGBP1 mitigates cytosine methylation at repetitive DNA sequences
- Author
-
Bengt Westermark, Prasoon Agarwal, Paul Collier, Vladimir Benes, Markus Hsi-Yang Fritz, Umashankar Singh, and Helena Jernberg Wiklund
- Subjects
Retroelements ,Bisulfite sequencing ,Interspersed repeat ,Alu element ,Biology ,Cell Line ,Cytosine ,Epigenetics of physical exercise ,Alu Elements ,Genetics ,Humans ,RNA-Directed DNA Methylation ,Repetitive Sequences, Nucleic Acid ,Medicinsk genetik ,High-Throughput Nucleotide Sequencing ,Methylation ,DNA ,Sequence Analysis, DNA ,DNA Methylation ,DNA-Binding Proteins ,CpG site ,DNA methylation ,CpG Islands ,Medical Genetics ,Biotechnology ,Research Article - Abstract
Background: CGGBP1 is a repetitive DNA-binding transcription regulator with target sites at CpG-rich sequences such as CGG repeats and Alu-SINEs and L1-LINEs. The role of CGGBP1 as a possible mediator of CpG methylation however remains unknown. At CpG-rich sequences cytosine methylation is a major mechanism of transcriptional repression. Concordantly, gene-rich regions typically carry lower levels of CpG methylation than the repetitive elements. It is well known that at interspersed repeats Alu-SINEs and L1-LINEs high levels of CpG methylation constitute a transcriptional silencing and retrotransposon inactivating mechanism. Results: Here, we have studied genome-wide CpG methylation with or without CGGBP1-depletion. By high throughput sequencing of bisulfite-treated genomic DNA we have identified CGGBP1 to be a negative regulator of CpG methylation at repetitive DNA sequences. In addition, we have studied CpG methylation alterations on Alu and L1 retrotransposons in CGGBP1-depleted cells using a novel bisulfite-treatment and high throughput sequencing approach. Conclusions: The results clearly show that CGGBP1 is a possible bidirectional regulator of CpG methylation at Alus, and acts as a repressor of methylation at L1 retrotransposons. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1593-2) contains supplementary material, which is available to authorized users.
- Published
- 2015
34. Human chromosomal bands: nested structure, high-definition map and molecular basis
- Author
-
Giorgio Bernardi, Concetta Federico, Oliver Clay, Salvatore Saccone, Maria Costantini, and Fabio Auletta
- Subjects
Genetics ,Base Composition ,Interspersed repeat ,Resolution (electron density) ,Intron ,Chromosome Mapping ,Biology ,DNA sequencing ,Chromosome Banding ,chemistry.chemical_compound ,CpG site ,chemistry ,Evolutionary biology ,Terminology as Topic ,Cytogenetic Analysis ,Chromosomes, Human ,Humans ,Human genome ,Isochores ,Genetics (clinical) ,DNA ,Cytosine - Abstract
In this paper, we report investigations on the nested structure, the high-definition mapping, and the molecular basis of the classical Giemsa and Reverse bands in human chromosomes. We found the rules according to which the approximately 3,200 isochores of the human genome are assembled in high (850-band) resolution bands, and the latter in low (400-band) resolution bands, so forming the nested mosaic structure of chromosomes. Moreover, we identified the borders of both sets of chromosomal bands at the DNA sequence level on the basis of our recent map of isochores, which represent the highest-resolution, ultimate bands. Indeed, beyond the 100-kb resolution of the isochore map, the guanine and cytosine (GC) profile of DNA becomes turbulent owing to the contribution of specific sequences such as exons, introns, interspersed repeats, CpG islands, etc. The isochore-based level of definition (100 kb) of chromosomal bands is much higher than the cytogenetic definition level (2-3 Mb). The major conclusions of this work concern the high degree of order found in the structure of chromosomal bands, their mapping at a high definition, and the solution of the long-standing problem of the molecular basis of chromosomal bands, as these could be defined on the basis of compositional DNA properties alone.
- Published
- 2006
35. Cellular inhibitors of long interspersed element 1 and Alu retrotransposition
- Author
-
K. Sue O'Shea, Heather L. Wiegand, John V. Moran, Amy E. Hulme, Bryan R. Cullen, Hal P. Bogerd, and Jose L. Garcia-Perez
- Subjects
Protein family ,Interspersed repeat ,Alu element ,Retrotransposon ,Biology ,medicine.disease_cause ,Cytosine Deaminase ,Alu Elements ,Cytidine Deaminase ,medicine ,Humans ,APOBEC Deaminases ,RNA, Messenger ,APOBEC3A ,Genetics ,Mutation ,Multidisciplinary ,Gene Expression Regulation, Developmental ,food and beverages ,Biological Sciences ,Long interspersed nuclear element ,Protein Transport ,Long Interspersed Nucleotide Elements ,DNA Transposable Elements ,Human genome ,HeLa Cells - Abstract
Long interspersed element (LINE) 1 retrotransposons are major genomic parasites that represent approximately 17% of the human genome. The LINE-1 ORF2 protein is also responsible for the mobility of Alu elements, which constitute a further approximately 11% of genomic DNA. Representative members of each element class remain mobile, and deleterious retrotransposition events can induce spontaneous genetic diseases. Here, we demonstrate that APOBEC3A and APOBEC3B, two members of the APOBEC3 family of human innate antiretroviral resistance factors, can enter the nucleus, where LINE-1 and Alu reverse transcription occurs, and specifically inhibit both LINE-1 and Alu retrotransposition. These data suggest that the APOBEC3 protein family may have evolved, at least in part, to defend the integrity of the human genome against endogenous retrotransposons.
- Published
- 2006
36. Strong Regional Biases in Nucleotide Substitution in the Chicken Genome
- Author
-
Hans Ellegren, Matthew T. Webster, and Erik Axelsson
- Subjects
Turkeys ,Interspersed repeat ,Gene Conversion ,Alu element ,Coturnix ,Biology ,Genome ,Evolution, Molecular ,Alu Elements ,Molecular evolution ,Genetics ,Animals ,Humans ,Point Mutation ,Gene conversion ,Molecular Biology ,Ecology, Evolution, Behavior and Systematics ,Base Composition ,Likelihood Functions ,Introns ,Fixation (population genetics) ,Long Interspersed Nucleotide Elements ,Microchromosome ,CpG Islands ,Chickens ,GC-content - Abstract
Interspersed repeats have emerged as a valuable tool for studying neutral patterns of molecular evolution. Here we analyze variation in the rate and pattern of nucleotide substitution across all autosomes in the chicken genome by comparing the present-day CR1 repeat sequences with their ancestral copies and reconstructing nucleotide substitutions with a maximum likelihood model. The results shed light on the origin and evolution of large-scale heterogeneity in GC content found in the genomes of birds and mammals--the isochore structure. In contrast to mammals, where GC content is becoming homogenized, heterogeneity in GC content is being reinforced in the chicken genome. This is also supported by patterns of substitution inferred from alignments of introns in chicken, turkey, and quail. Analysis of individual substitution frequencies is consistent with the biased gene conversion (BGC) model of isochore evolution, and it is likely that patterns of evolution in the chicken genome closely resemble those in the ancestral amniote genome, when it is inferred that isochores originated. Microchromosomes and distal regions of macrochromosomes are found to have elevated substitution rates and a more GC-biased pattern of nucleotide substitution. This can largely be accounted for by a strong correlation between GC content and the rate and pattern of substitution. The results suggest that an interaction between increased mutability at CpG motifs and fixation biases due to BGC could explain increased levels of divergence in GC-rich regions.
- Published
- 2006
37. Characterization and distribution of repetitive elements in association with genes in the human genome
- Author
-
Shaw Jenq Tsai, Kai-Chiang Liang, Joseph T. Tseng, and H. Sunny Sun
- Subjects
Regulation of gene expression ,Genetics ,Genome evolution ,Genome, Human ,Organic Chemistry ,Interspersed repeat ,Computational biology ,Biology ,ENCODE ,Biochemistry ,Trans-regulatory element ,Genome ,Computational Mathematics ,Structural Biology ,Humans ,Human genome ,Gene ,Repetitive Sequences, Nucleic Acid - Abstract
Repetitive elements constitute more than 50% of the human genome. Recent studies implied that the complexity of living organisms is not just a direct outcome of the number of coding sequences; the repetitive elements, which do not encode proteins, may also play a significant role. Though scattered studies showed that repetitive elements in the regulatory regions of a gene control gene expression, no systematic survey has been done to report the characterization and distribution of various types of these repetitive elements in the human genome. Sequences from 5' and 3' untranslated regions and upstream and downstream of a gene were downloaded from the Ensembl database. The repetitive elements in the neighborhood of each gene were identified and classified using cross-matching implemented in RepeatMasker. The annotation and distribution of distinct classes of repetitive elements associated with individual genes were collected to characterize genes in association with different types of repetitive elements using a systems biology program. We identified a total of 1,068,400 repetitive elements which belong to 37-class families and 1235 subclasses that are associated with 33,761 genes and 57,365 transcripts. In addition, we found that tandem repeats are preferentially located proximal to the transcription start site (TSS) of genes and that these genes are mainly involved in developmental processes. On the other hand, interspersed repetitive elements tended to accumulate distal to the TSS, and interspersed repeat-containing genes took part mainly in catabolic/metabolic processes. Results from the distribution analysis were collected and used to construct a gene-based repetitive element database (GBRED; http://www.binfo.ncku.edu.tw/GBRED/index.html). A user-friendly web interface was designed to provide the information of repetitive elements associated with any particular gene(s).
This is the first study focusing on the gene-associated repetitive elements in the human genome. Our data showed that distinct genes are associated with different kinds of repetitive elements and implied that such combinations may shape the function of these genes. Aside from the conventional view of these elements in genome evolution, results from this study offer a systemic review to facilitate exploitation of these elements in genome function.
- Published
- 2014
38. Design optimization methods for genomic DNA tiling arrays
- Author
-
John E. Karro, Joel Rozowsky, Michael Snyder, Olof Emanuelsson, Valery Trifonov, Mark Gerstein, Paul Bertone, Ming-Yang Kao, and Falk Schubert
- Subjects
animal structures ,genetic structures ,genetic processes ,Interspersed repeat ,Biology ,Sensitivity and Specificity ,Subsequence ,Methods ,Genetics ,Animals ,Humans ,natural sciences ,Time complexity ,Genetics (clinical) ,Oligonucleotide Array Sequence Analysis ,Sequence ,Tiling array ,Genome, Human ,Gene Expression Profiling ,Reproducibility of Results ,Interspersed Repetitive Sequences ,Path (graph theory) ,Human genome ,sense organs ,Algorithm ,Algorithms - Abstract
A recent development in microarray research entails the unbiased coverage, or tiling, of genomic DNA for the large-scale identification of transcribed sequences and regulatory elements. A central issue in designing tiling arrays is that of arriving at a single-copy tile path, as significant sequence cross-hybridization can result from the presence of non-unique probes on the array. Due to the fragmentation of genomic DNA caused by the widespread distribution of repetitive elements, the problem of obtaining adequate sequence coverage increases with the sizes of subsequence tiles that are to be included in the design. This becomes increasingly problematic when considering complex eukaryotic genomes that contain many thousands of interspersed repeats. The general problem of sequence tiling can be framed as finding an optimal partitioning of non-repetitive subsequences over a prescribed range of tile sizes, on a DNA sequence comprising repetitive and non-repetitive regions. Exact solutions to the tiling problem become computationally infeasible when applied to large genomes, but successive optimizations are developed that allow their practical implementation. These include an efficient method for determining the degree of similarity of many oligonucleotide sequences over large genomes, and two algorithms for finding an optimal tile path composed of longer sequence tiles. The first algorithm, a dynamic programming approach, finds an optimal tiling in linear time and space; the second applies a heuristic search to reduce the space complexity to a constant requirement. A Web resource has also been developed, accessible at http://tiling.gersteinlab.org, to generate optimal tile paths from user-provided DNA sequences.
- Published
- 2005
39. Distribution and characterization of staphylococcal interspersed repeat units (SIRUs) and potential use for strain differentiation
- Author
-
Katherine J. Hardy, Peter M. Hawkey, David W. Ussery, and Beryl Oppenheim
- Subjects
Genetics ,Staphylococcus aureus ,Base Sequence ,Molecular Sequence Data ,Interspersed repeat ,Population genetics ,Minisatellite Repeats ,Staphylococcal Infections ,Biology ,Microbiology ,Genome ,United Kingdom ,Bacterial Typing Techniques ,Disease Outbreaks ,Evolution, Molecular ,Variable number tandem repeat ,Minisatellite ,Tandem repeat ,Humans ,Methicillin Resistance ,Typing ,Clade ,Sequence Alignment ,Genome, Bacterial - Abstract
Variable-number tandem repeats (VNTRs) have been shown to be a powerful tool in the determination of evolutionary relationships and population genetics of bacteria. The sequencing of a number of Staphylococcus aureus genomes has allowed the identification of novel VNTR sequences in S. aureus, which are similar to those used in the study of the evolution of Mycobacterium tuberculosis clades. Seven VNTRs, termed staphylococcal interspersed repeat units (SIRUs), distributed around the genome are described, occurring in both unique and multiple sites, and varying in length from 48 to 159 bp. Variations in copy numbers were observed in all loci, within both the sequenced genomes and the UK epidemic methicillin-resistant S. aureus (EMRSA) isolates. Clonally related UK EMRSA isolates were clustered using SIRUs, which provided a greater degree of discrimination than multi-locus sequence typing, indicating that VNTRs may be a more appropriate evolutionary marker for studying transmission events and the geographical spread of S. aureus clades.
- Published
- 2004
40. Evolutionary implications of pericentromeric gene expression in humans
- Author
-
Jonathan M. Mudge and Michael S. Jackson
- Subjects
Male ,Transcription, Genetic ,Sequence analysis ,In silico ,Centromere ,Interspersed repeat ,DNA, Satellite ,Biology ,Evolution, Molecular ,Testicular Neoplasms ,Genes, Duplicate ,Gene Duplication ,Testis ,Gene duplication ,Genetics ,Chromosomes, Human ,Humans ,Molecular Biology ,Gene ,Genetics (clinical) ,Expressed Sequence Tags ,Regulation of gene expression ,Expressed sequence tag ,Chimera ,Gene Expression Profiling ,Chromosome Mapping ,Chromosome ,Sequence Analysis, DNA ,Gene Expression Regulation, Neoplastic ,Gene Expression Regulation - Abstract
Human pericentromeric sequences are enriched for recent sequence duplications. The continual creation and shuffling of these duplications can create novel intron-exon structures and it has been suggested that these regions have a function as gene nurseries. However, these sequences are also rich in satellite repeats which can repress transcription, and analyses of chromosomes 10 and 21 have suggested that they are transcript poor. Here, we investigate the relationship between pericentromeric duplication and transcription by analyzing the in silico transcriptional profiles within the proximal 1.5 Mb of genomic sequence on all human chromosome arms in relation to duplication status. We identify an approximately 5x excess of transcripts specific to cancer and/or testis in pericentromeric duplications compared to surrounding single copy sequence, with the expression of >50% of all transcripts in duplications being restricted to these tissues. We also identify an approximately 5x excess of transcripts in duplications which contain large quantities of interspersed repeats. These results indicate that the transcriptional profiles of duplicated and single copy sequences within pericentromeric DNA are distinct, suggesting that pericentromeric instability is unlikely to represent a common route for gene creation but may have a disproportionate effect upon genes whose function is restricted to the germ line.
- Published
- 2004
41. From masking repeats to identifying functional repeats in the mouse transcriptome
- Author
-
Christian Schönbach
- Subjects
Genetics ,Transposable element ,Whole genome sequencing ,Genome ,Transcription, Genetic ,Gene Expression Profiling ,MEDLINE ,Interspersed repeat ,Genomics ,Context (language use) ,Biology ,Mice ,Databases, Genetic ,Animals ,Humans ,Direct repeat ,Human genome ,RNA Processing, Post-Transcriptional ,Molecular Biology ,Software ,Repetitive Sequences, Nucleic Acid ,Information Systems - Abstract
The back-to-back release of the mouse genome and the functionally annotated RIKEN mouse full-length cDNA collection was an important milestone in mammalian genomics. Yet much of the data remain to be explored in terms of biological effects and mechanisms. For example, interspersed repeats account for 39 per cent of the mouse genome sequence and 11 per cent of representative transcripts. A considerable number of transposable repeat elements are still active and propagating in mouse compared with human. While existing repeat databases and tools assist the classification of repeats or identification of new repeats, there is little bioinformatic support towards exploring the extent and role of repeats in transcriptional variation, modulation of protein function, or gene regulatory events. Since the mouse is used as a model organism to study human genes and their disease associations, this review focuses on information extraction and collation that captures the functional context of repeats in mouse transcripts to facilitate the biological interpretation and extrapolation of findings to the human.
- Published
- 2004
42. Analysis of the Gene-Dense Major Histocompatibility Complex Class III Region and Its Comparison to Mouse
- Author
-
Shizhen Qin, Tao Xie, Begoña Aguado, R. Duncan Campbell, Mary Ellen Ahearn, Lee Rowen, Leroy Hood, and Anup Madan
- Subjects
RNA, Untranslated ,Molecular Sequence Data ,Interspersed repeat ,Biology ,Genome ,Conserved sequence ,Conserved non-coding sequence ,Evolution, Molecular ,Major Histocompatibility Complex ,Mice ,Intergenic region ,Genetics ,Animals ,Humans ,Letters ,Gene ,Conserved Sequence ,Genetics (clinical) ,Polymorphism, Genetic ,Intron ,Proteins ,Sequence Analysis, DNA ,Alternative Splicing ,Genes ,Protein Biosynthesis ,Human genome - Abstract
In mammals, the Major Histocompatibility Complex class I and II gene clusters are separated by an approximately 700-kb stretch of sequence called the MHC class III region, which has been associated with susceptibility to numerous diseases. To facilitate understanding of this medically important and architecturally interesting portion of the genome, we have sequenced and analyzed both the human and mouse class III regions. The cross-species comparison has facilitated the identification of 60 genes in human and 61 in mouse, including a potential RNA gene for which the introns are more conserved across species than the exons. Delineation of global organization, gene structure, alternative splice forms, protein similarities, and potential cis-regulatory elements leads to several conclusions: (1) The human MHC class III region is the most gene-dense region of the human genome: 14% of the sequence is coding, approximately 72% of the region is transcribed, and there is an average of 8.5 genes per 100 kb. (2) Gene sizes, number of exons, and intergenic distances are for the most part similar in both species, implying that interspersed repeats have had little impact in disrupting the tight organization of this densely packed set of genes. (3) The region contains a heterogeneous mixture of genes, only a few of which have a clearly defined and proven function. Although many of the genes are of ancient origin, some appear to exist only in mammals and fish, implying they might be specific to vertebrates. (4) Conserved noncoding sequences are found primarily in or near the 5'-UTR or the first intron of genes, and seldom in the intergenic regions. Many of these conserved blocks are likely to be cis-regulatory elements.
- Published
- 2003
43. Structural Dynamics of Eukaryotic Chromosome Evolution
- Author
-
Evan E. Eichler and David Sankoff
- Subjects
Genome evolution ,Centromere ,Interspersed repeat ,Chromosomal rearrangement ,Biology ,Synteny ,Genome ,Chromosomes ,Molecular evolution ,Gene Duplication ,Chromosome instability ,Animals ,Humans ,Chromosome Aberrations ,Recombination, Genetic ,Genetics ,Multidisciplinary ,Computational Biology ,Chromosome ,Telomere ,Biological Evolution ,Eukaryotic Cells ,Evolutionary biology ,Eukaryotic chromosome fine structure ,DNA Transposable Elements - Abstract
Large-scale genome sequencing is providing a comprehensive view of the complex evolutionary forces that have shaped the structure of eukaryotic chromosomes. Comparative sequence analyses reveal patterns of apparently random rearrangement interspersed with regions of extraordinarily rapid, localized genome evolution. Numerous subtle rearrangements near centromeres, telomeres, duplications, and interspersed repeats suggest hotspots for eukaryotic chromosome evolution. This localized chromosomal instability may play a role in rapidly evolving lineage-specific gene families and in fostering large-scale changes in gene order. Computational algorithms that take into account these dynamic forces along with traditional models of chromosomal rearrangement show promise for reconstructing the natural history of eukaryotic chromosomes.
- Published
- 2003
44. A novel approach for identifying candidate imprinted genes through sequence analysis of imprinted and control genes
- Author
-
Xiayi Ke, David O. Robinson, Andrew Collins, and Simon N. Thomas
- Subjects
Male ,Genetics ,Candidate gene ,Sequence analysis ,Interspersed repeat ,Alu element ,Biology ,Genome ,Genomic Imprinting ,Humans ,Short Interspersed Nucleotide Elements ,Female ,Repeated sequence ,Genomic imprinting ,Genetics (clinical) - Abstract
Through the sequence analysis of 27 imprinted human genes and a set of 100 control genes we have developed a novel approach for identifying candidate imprinted genes based on the differences in sequence composition observed. The imprinted genes were found to be associated with significantly reduced numbers of short interspersed transposable element (SINE) Alus and mammalian-wide interspersed repeat (MIR) elements, as previously reported. In addition, a significant association between imprinted genes and increased numbers of low-complexity repeats was also evident. Numbers of the Alu classes AluJ and AluS were found to be significantly depleted in some parts of the flanking regions of imprinted genes. A recent study has proposed that there is active selection against SINE elements in imprinted regions. Alternatively, there may be differences in the rates of insertion of Alu elements. Our study indicates that this difference extends both upstream and downstream of the coding region. This and other consistent differences between the sequence characteristics of imprinted and control genes have enabled us to develop a discriminant function, which can be used to screen the genome for candidate imprinted genes. We have applied this function to a number of genes whose imprinting status is disputed or uncertain.
- Published
- 2002
45. Novel Vertebrate Genes and Putative Regulatory Elements Identified at Kidney Disease and NR2E1/fierce Loci
- Author
-
Diana L. Palmquist, Brett S. Abrahams, Byrappa Venkatesh, Elizabeth M. Simpson, Melissa L Berry, Sydney Brenner, Alice Tay, Y. H. Tan, Grace M Mak, and Jennifer R Saionz
- Subjects
RNA, Untranslated ,Molecular Sequence Data ,Interspersed repeat ,Receptors, Cytoplasmic and Nuclear ,Sequence Homology ,Locus (genetics) ,Regulatory Sequences, Nucleic Acid ,Biology ,Takifugu ,Synteny ,Mice ,Genetics ,Animals ,Humans ,Amino Acid Sequence ,Gene ,Intron ,Sequence Analysis, DNA ,Interspersed Repetitive Sequences ,Orphan Nuclear Receptors ,biology.organism_classification ,Divergent evolution ,Alternative Splicing ,Regulatory sequence ,Kidney Diseases ,Carrier Proteins ,Sequence Alignment - Abstract
Fierce (frc) mice are deleted for nuclear receptor 2e1 (Nr2e1), and exhibit cerebral hypoplasia, blindness, and extreme aggression. To characterize the Nr2e1 locus, which may also contain the mouse kidney disease (kd) allele, we compared sequence from human, mouse, and the puffer fish Fugu rubripes. We identified a novel gene, c222389, containing conserved elements in noncoding regions. We also discovered a novel vertebrate gene conserved across its length in prokaryotes and invertebrates. Based on a dramatic upregulation in lactating breast, we named this gene lactation elevated-1 (LACE1). Two separate 100-bp elements within the first NR2E1 intron were virtually identical between the three species, despite an estimated 450 million years of divergent evolution. These elements represent strong candidates for functional NR2E1 regulatory elements in vertebrates. A high degree of conservation across NR2E1 combined with a lack of interspersed repeats suggests that an array of regulatory elements embedded within the gene is required for proper gene expression.
- Published
- 2002
46. A Technique for Genome-Wide Identification of Differences in the Interspersed Repeats Integrations between Closely Related Genomes and Its Application to Detection of Human-Specific Integrations of HERV-K LTRs
- Author
-
Eugene D. Sverdlov, Gerhard Hunsmann, Ilgar Z. Mamedov, Konstantin Khodosevich, Yuri B. Lebedev, Anton Buzdin, and Tatyana V. Vinogradova
- Subjects
Genetics ,Lineage (genetic) ,Base Sequence ,Pan troglodytes ,Genome, Human ,Endogenous Retroviruses ,Molecular Sequence Data ,Interspersed repeat ,Terminal Repeat Sequences ,Endogenous retrovirus ,Interspersed Repetitive Sequences ,Biology ,Genome ,Long terminal repeat ,Species Specificity ,Suppression subtractive hybridization ,Animals ,Humans ,Human genome ,Phylogeny - Abstract
We have developed a method of targeted genomic difference analysis (TGDA) for genome-wide detection of interspersed repeat integration site differences between closely related genomes. The method includes a whole-genome amplification of the flanks adjacent to target interspersed repetitive elements in both genomic DNAs under comparison, and subtractive hybridization (SH) of the selected amplicons. The potential of TGDA was demonstrated by the detection of differences in the integration sites of human endogenous retroviruses K (HERV-K) and related solitary long terminal repeats (LTRs) between the human and chimpanzee genomes. Of 55 randomly sequenced clones from a library enriched with human-specific integration (HSI) sites, 33 (60%) represented HSIs. All the human-specific (Hs) LTRs belong to two related evolutionarily young groups, suggesting simultaneous activity of two master genes in the hominid lineage. No deletion/insertion polymorphism was detected for the LTR HSIs for 25 unrelated caucasoid individuals. We also discuss possible research applications of TGDA.
- Published
- 2002
47. A Complete and Accurate Ab Initio Repeat Finding Algorithm
- Author
-
Zhang Xiaoli, Shuaibin Lian, Peng Wang, Xinwu Chen, and Xianhua Dai
- Subjects
0301 basic medicine ,Base pair ,Interspersed repeat ,Hash function ,Ab initio ,Health Informatics ,Biology ,General Biochemistry, Genetics and Molecular Biology ,Computer Science Applications ,03 medical and health sciences ,Variable number tandem repeat ,030104 developmental biology ,Tandem repeat ,Direct repeat ,Chromosomes, Human ,Humans ,Human genome ,Algorithm ,Base Pairing ,Algorithms ,Repetitive Sequences, Nucleic Acid - Abstract
It has become clear that repetitive sequences have played multiple roles in eukaryotic genome evolution, including increasing genetic diversity through mutation, changing gene expression, and facilitating the generation of novel genes. However, identifying repetitive elements ab initio can be difficult. Several classical ab initio repeat-finding tools have already been presented and compared, but their completeness and accuracy in detecting repeats are rather poor. To this end, we propose a new ab initio repeat-finding tool, named HashRepeatFinder, which is based on a hash index and word counting. Furthermore, we compared the performance of HashRepeatFinder with that of two other well-known tools, RepeatScout and Repeatfinder, on human genome data (hg19). The results support the following three conclusions: (1) HashRepeatFinder is the most complete of the three tools on almost all chromosomes, especially chr9 (8 times RepeatScout, 10 times Repeatfinder); (2) in detecting large repeats, HashRepeatFinder also performed best on all chromosomes, especially chr3 (24 times RepeatScout and 250 times Repeatfinder) and chr19 (12 times RepeatScout and 60 times Repeatfinder); (3) in terms of accuracy, HashRepeatFinder can merge the abundant repeats with high accuracy.
- Published
- 2014
48. Genome analysis of a major urban malaria vector mosquito, Anopheles stephensi
- Author
-
José M. C. Ribeiro, Maria V. Sharakhova, Marta Tojo, Phillip George, Scott J. Emrich, Robert M. Waterhouse, Xiaofang Jiang, Ryan C. Kennedy, Michael A. Riehle, Bo Wang, Chioma Oringanje, Kenneth D. Vernick, Victoria L.M. Davidson, A. Brantley Hall, Kristin Michel, Ashley Peery, Anthony A. James, Gareth Maslen, Shirley Luckhart, Robert E. Settlage, Nazzy Pakpour, Aleksey Komissarov, Yumin Qi, Zhijian Tu, Daniel Lawson, Igor V. Sharakhov, Xiao Guang Chen, Karin Eiglmeier, Maria F. Unger, Michelle M. Riehle, Shrinivasrao P. Mane, Jose M. C. Tubio, Yogesh S. Shouche, Atashi Sharma, Peter Arensburger, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Waterhouse, Robert, Department of Biochemistry [Blacksburg], Virginia Tech [Blacksburg], Program of Genetics, Bioinformatics, and Computational Biology [Blacksburg] (GBCB), Department of Entomology [Blacksburg], Department of Pathogen Biology, Southern Medical University [Guangzhou], Department of Genetic Medicine and Development, University of Geneva Medical School, University of Geneva Medical School-University of Geneva Medical School, Swiss Institute of Bioinformatics [Genève] (SIB), Computer Science and Artificial Intelligence Laboratory [Cambridge] (CSAIL), Massachusetts Institute of Technology (MIT), Broad Institute of MIT and Harvard (BROAD INSTITUTE), Harvard Medical School [Boston] (HMS)-Massachusetts Institute of Technology (MIT)-Massachusetts General Hospital [Boston], Theodosius Dobzhansky Center for Genome Bioinformatics, St Petersburg State University (SPbU), Institute of Cytology, Russian Academy of Sciences [Moscow] (RAS), Department of Microbiology, University of Minnesota, University of Minnesota [Twin Cities] (UMN), University of Minnesota System-University of Minnesota System-University of Minnesota [Twin Cities] (UMN), University of Minnesota System-University of Minnesota System, National Center for Cell Science, Pune University, European Bioinformatics 
Institute [Hinxton] (EMBL-EBI), EMBL Heidelberg, Department of Medical Microbiology and Immunology, University of California [Davis] (UC Davis), University of California (UC)-University of California (UC), Department of Biological Sciences [Pomona], California State Polytechnic University [Pomona] (CAL POLY POMONA), Division of Biology, Kansas State University, Génétique et Génomique des Insectes vecteurs, Institut Pasteur [Paris] (IP)-Centre National de la Recherche Scientifique (CNRS), Department of Computer Science and Engineering, University of Notre Dame [Indiana] (UND), Department of Bioengineering and Therapeutic Sciences, University of California [San Francisco] (UC San Francisco), Virginia Bioinformatics Institute, Department of Entomology, University of Arizona, Department of Physiology, School of Medicine – CIMUS, Instituto de Investigaciones Sanitarias, Universidade de Santiago de Compostela [Spain] (USC)-Universidade de Santiago de Compostela [Spain] (USC), The Wellcome Trust Sanger Institute [Cambridge], Department of Biological Sciences [Notre Dame], Vector Molecular Biology Section, Laboratory of Malaria and Vector Research, National Institutes of Health, Departments of Microbiology & Molecular Genetics and Molecular Biology & Biochemistry, University of California [Irvine] (UC Irvine), Fralin Life Science Institute and the Virginia Experimental Station, NIH grants AI77680 and AI105575 to ZT, AI094289 and AI099528 to IVS, AI29746 to AAJ, AI095842 to KM, AI073745 to MAR, AI080799 and AI078183 to SL, and AI042361 and AI073685 to KDV., AP and IVS are supported in part by the Institute for Critical Technology and Applied Science and the NSF award 0850198., RMW is supported by Marie Curie International Outgoing Fellowship PIOF-GA-2011-303312., XC is supported by GDUPS (2009)., This work was also supported in part by NSF grant CNS-0960081, This work was also supported in part by the HokieSpeed and BlueRidge supercomputers at Virginia Tech., YS thanks the 
Department of Biotechnology, Government of India for the financial support., University of California-University of California, Institut Pasteur [Paris]-Centre National de la Recherche Scientifique (CNRS), University of California [San Francisco] (UCSF), and University of California [Irvine] (UCI)
- Subjects
Urban Population ,Genome, Insect ,Retrotransposon ,Insect/genetics ,2.2 Factors relating to physical environment ,Genome ,Anopheles/genetics/metabolism ,0302 clinical medicine ,2.2 Factors relating to the physical environment ,Cluster Analysis ,ddc:576.5 ,Aetiology ,Phylogeny ,Genetics ,0303 health sciences ,Anopheles ,Chromosome Mapping ,Single Nucleotide ,Biological Sciences ,3. Good health ,Infectious Diseases ,Insect Proteins ,Infection ,Sequence Analysis ,Biotechnology ,Evolution ,Bioinformatics ,Interspersed repeat ,Biology ,Polymorphism, Single Nucleotide ,Synteny ,Chromosomes ,Evolution, Molecular ,03 medical and health sciences ,Rare Diseases ,[SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN] ,Information and Computing Sciences ,parasitic diseases ,Animals ,Humans ,Polymorphism ,Anopheles stephensi ,030304 developmental biology ,Whole genome sequencing ,[SDV.GEN]Life Sciences [q-bio]/Genetics ,Autosome ,Research ,fungi ,Human Genome ,Molecular ,DNA ,Sequence Analysis, DNA ,biology.organism_classification ,Chromosomes, Insect ,Insect Vectors ,Malaria ,Vector-Borne Diseases ,Good Health and Well Being ,Evolutionary biology ,Malaria/transmission ,Transcriptome ,Insect Proteins/genetics/metabolism ,Insect Vectors/genetics ,Insect ,Environmental Sciences ,030217 neurology & neurosurgery - Abstract
Background: Anopheles stephensi is the key vector of malaria throughout the Indian subcontinent and Middle East and an emerging model for molecular and genetic studies of mosquito-parasite interactions. The type form of the species is responsible for the majority of urban malaria transmission across its range. Results: Here, we report the genome sequence and annotation of the Indian strain of the type form of An. stephensi. The 221 Mb genome assembly represents more than 92% of the entire genome and was produced using a combination of 454, Illumina, and PacBio sequencing. Physical mapping assigned 62% of the genome onto chromosomes, enabling chromosome-based analysis. Comparisons between An. stephensi and An. gambiae reveal that the rate of gene order reshuffling on the X chromosome was three times higher than that on the autosomes. An. stephensi has more heterochromatin in pericentric regions but less repetitive DNA in chromosome arms than An. gambiae. We also identify a number of Y-chromosome contigs and BACs. Interspersed repeats constitute 7.1% of the assembled genome while LTR retrotransposons alone comprise more than 49% of the Y contigs. RNA-seq analyses provide new insights into mosquito innate immunity, development, and sexual dimorphism. Conclusions: The genome analysis described in this manuscript provides a resource and platform for fundamental and translational research into a major urban malaria vector. Chromosome-based investigations provide unique perspectives on Anopheles chromosome evolution. RNA-seq analysis and studies of immunity genes offer new insights into mosquito biology and mosquito-parasite interactions. Funding: National Science Foundation (U.S.) (Grant CNS-0960081)
- Published
- 2014
49. Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome
- Author
-
John M. Greally
- Subjects
Transposable element ,Guanine ,Time Factors ,Retroelements ,Interspersed repeat ,Biology ,behavioral disciplines and activities ,Genome ,Evolution, Molecular ,Cytosine ,Genomic Imprinting ,Open Reading Frames ,Humans ,Short Interspersed Nucleotide Elements ,Imprinting (psychology) ,Genetics ,Multidisciplinary ,Models, Genetic ,Genome, Human ,Sequence Analysis, DNA ,Biological Sciences ,CpG site ,DNA Transposable Elements ,CpG Islands ,Human genome ,Genomic imprinting - Abstract
To test whether regions undergoing genomic imprinting have unique genomic characteristics, imprinted and nonimprinted human loci were compared for nucleotide and retroelement composition. Maternally and paternally expressed subgroups of imprinted genes were found to differ in terms of guanine and cytosine, CpG, and retroelement content, indicating a segregation into distinct genomic compartments. Imprinted regions have been normally permissive to L1 long interspersed transposable element retroposition during mammalian evolution but universally and significantly lack short interspersed transposable elements (SINEs). The primate-specific Alu SINEs, as well as the more ancient mammalian-wide interspersed repeat SINEs, are found at significantly low densities in imprinted regions. The latter paleogenomic signature indicates that the sequence characteristics of currently imprinted regions existed before the mammalian radiation. Transitions from imprinted to nonimprinted genomic regions in cis are characterized by a sharp inflection in SINE content, demonstrating that this genomic characteristic can help predict the presence and extent of regions undergoing imprinting. During primate evolution, SINE accumulation in imprinted regions occurred at a decreased rate compared with control loci. The constraint on SINE accumulation in imprinted regions may be mediated by an active selection process. This selection could be because of SINEs attracting and spreading methylation, as has been found at other loci. Methylation-induced silencing could lead to deleterious consequences at imprinted loci, where inactivation of one allele is already established, and expression is often essential for embryonic growth and survival.
- Published
- 2001
50. Association between divergence and interspersed repeats in mammalian noncoding genomic DNA
- Author
-
Laura Elnitski, Ross C. Hardison, Francesca Chiaromonte, Shan Yang, Webb Miller, and Von Bing Yap
- Subjects
Genetics ,Mutation ,Genome ,Multidisciplinary ,Retroelements ,Genome, Human ,Interspersed repeat ,DNA ,Biological Sciences ,Biology ,medicine.disease_cause ,Noncoding DNA ,Conserved non-coding sequence ,Evolution, Molecular ,Interspersed Repetitive Sequences ,Mice ,Random Allocation ,Intergenic region ,Cot analysis ,Sequence Homology, Nucleic Acid ,medicine ,Animals ,Humans ,Human genome - Abstract
The amount of noncoding genomic DNA sequence that aligns between human and mouse varies substantially in different regions of their genomes, and the amount of repetitive DNA also varies. In this report, we show that divergence in noncoding nonrepetitive DNA is strongly correlated with the amount of repetitive DNA in a region. We investigated aligned DNA in four large genomic regions with finished human sequence and almost or completely finished mouse sequence. These regions, totaling 5.89 Mb of DNA, are on different chromosomes and vary in their base composition. An analysis based on sliding windows of 10 kb shows that the fraction of aligned noncoding nonrepetitive DNA and the fraction of repetitive DNA are negatively correlated, both at the level of an entire region and locally within it. This conclusion is strongly supported by a randomization study, in which repetitive elements are removed and randomly relocated along the sequences. Thus, regions of noncoding genomic DNA that accumulated fewer point mutations since the primate–rodent divergence also suffered fewer retrotransposition events. These results indicate that some regions of the genome are more “flexible” over the time scale of mammalian evolution, being able to accommodate many point mutations and insertions, whereas other regions are more “rigid” and accumulate fewer changes. Stronger conservation is generally interpreted as indicating more extensive or more important function. The evidence presented here of correlated variation in the rates of different evolutionary processes across noncoding DNA must be considered in assessing such conservation for evidence of selection.
- Published
- 2001