101 results on '"Trees-Juen Chuang"'
Search Results
2. A longer time to relapse is associated with a larger increase in differences between paired primary and recurrent IDH wild-type glioblastomas at both the transcriptomic and genomic levels
- Author
Wei-Min Ho, Chia-Ying Chen, Tai-Wei Chiang, and Trees-Juen Chuang
- Subjects
Glioblastomas ,Patient-matched longitudinal analysis ,Time to relapse ,Prognostic model ,Progression-free survival ,Neurology. Diseases of the nervous system ,RC346-429
- Abstract
Glioblastoma (GBM) is the most common malignant brain tumor in adults, which remains incurable and often recurs rapidly after initial therapy. While large efforts have been dedicated to uncover genomic/transcriptomic alterations associated with the recurrence of GBMs, the evolutionary trajectories of matched pairs of primary and recurrent (P-R) GBMs remain largely elusive. It remains challenging to identify genes associated with time to relapse (TTR) and construct a stable and effective prognostic model for predicting TTR of primary GBM patients. By integrating RNA-sequencing and genomic data from multiple datasets of patient-matched longitudinal GBMs of isocitrate dehydrogenase wild-type (IDH-wt), here we examined the associations of TTR with heterogeneities between paired P-R GBMs in gene expression profiles, tumor mutation burden (TMB), and microenvironment. Our results revealed a positive correlation between TTR and transcriptomic/genomic differences between paired P-R GBMs, higher percentages of non-mesenchymal-to-mesenchymal transition and mesenchymal subtype for patients with a short TTR than for those with a long TTR, a high correlation between paired P-R GBMs in gene expression profiles and TMB, and a negative correlation between the fitting level of such a paired P-R GBM correlation and TTR. According to these observations, we identified 55 TTR-associated genes and thereby constructed a seven-gene (ZSCAN10, SIGLEC14, GHRHR, TBX15, TAS2R1, CDKL1, and CD101) prognostic model for predicting TTR of primary IDH-wt GBM patients using univariate/multivariate Cox regression analyses. The risk scores estimated by the model were significantly negatively correlated with TTR in the training set and two independent testing sets.
The model also segregated IDH-wt GBM patients into two groups with significantly divergent progression-free survival outcomes and showed promising performance for predicting 1-, 2-, and 3-year progression-free survival rates in all training and testing sets. Our findings provide new insights into the molecular understanding of GBM progression at recurrence and potential targets for therapeutic treatments.
- Published
- 2024
3. Long-term hematopoietic transfer of the anti-cancer and lifespan-extending capabilities of a genetically engineered blood system by transplantation of bone marrow mononuclear cells
- Author
Jing-Ping Wang, Chun-Hao Hung, Yae-Huei Liou, Ching-Chen Liu, Kun-Hai Yeh, Keh-Yang Wang, Zheng-Sheng Lai, Biswanath Chatterjee, Tzu-Chi Hsu, Tung-Liang Lee, Yu-Chiau Shyu, Pei-Wen Hsiao, Liuh-Yow Chen, Trees-Juen Chuang, Chen-Hsin Albert Yu, Nan-Shih Liao, and C-K James Shen
- Subjects
anti-cancer ,EKLF/KLF1 ,bone marrow transplantation ,leukocytes ,aging ,Medicine ,Science ,Biology (General) ,QH301-705.5
- Abstract
A causal relationship exists among the aging process, organ decay and dysfunction, and the occurrence of various diseases including cancer. A genetically engineered mouse model, termed Klf1K74R/K74R or Klf1(K74R), carrying a mutation on the well-conserved sumoylation site of the hematopoietic transcription factor KLF1/EKLF has been generated that possesses extended lifespan and healthy characteristics, including cancer resistance. We show that the healthy longevity characteristics of the Klf1(K74R) mice, as exemplified by their higher anti-cancer capability, are likely gender-, age-, and genetic background-independent. Significantly, the anti-cancer capability, in particular that against melanoma as well as hepatocellular carcinoma, and lifespan-extending property of Klf1(K74R) mice, could be transferred to wild-type mice via transplantation of their bone marrow mononuclear cells at a young age of the latter. Furthermore, NK(K74R) cells carry higher in vitro cancer cell-killing ability than wild-type NK cells. Targeted/global gene expression profiling analysis has identified changes in the expression of specific proteins, including the immune checkpoint factors PDCD and CD274, and cellular pathways in the leukocytes of the Klf1(K74R) that are in the directions of anti-cancer and/or anti-aging. This study demonstrates the feasibility of developing a transferable hematopoietic/blood system for long-term anti-cancer and, potentially, for anti-aging.
- Published
- 2024
4. CircMiMi: a stand-alone software for constructing circular RNA-microRNA-mRNA interactions across species
- Author
Tai-Wei Chiang, Te-Lun Mai, and Trees-Juen Chuang
- Subjects
Circular RNA ,microRNA ,Regulatory interaction ,Alignment ambiguity ,Reverse complementary sequence ,Autism spectrum disorder ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5
- Abstract
Background: Circular RNAs (circRNAs) are a class of non-coding RNAs formed by pre-mRNA back-splicing, which are widely expressed in animal/plant cells and often play an important role in regulating microRNA (miRNA) activities. While numerous databases have collected a large amount of predicted circRNA candidates and provided the corresponding circRNA-regulated interactions, a stand-alone package for constructing circRNA-miRNA-mRNA interactions based on user-identified circRNAs across species is lacking. Results: We present CircMiMi (circRNA-miRNA-mRNA interactions), a modular, Python-based software to identify circRNA-miRNA-mRNA interactions across 18 species (including 16 animals and 2 plants) with the given coordinates of circRNA junctions. The CircMiMi-constructed circRNA-miRNA-mRNA interactions are derived from circRNA-miRNA and miRNA-mRNA axes with the support of computational predictions and/or experimental data. CircMiMi also allows users to examine alignment ambiguity of back-splice junctions for checking circRNA reliability and examine reverse complementary sequences residing in the sequences flanking the circularized exons for investigating circRNA formation. We further employ CircMiMi to identify circRNA-miRNA-mRNA interactions based on the circRNAs collected in NeuroCirc, a large-scale database of circRNAs in the human brain. We construct circRNA-miRNA-mRNA interactions comprising differentially expressed circRNAs and miRNAs in autism spectrum disorder (ASD) and analyze the relevance of the targets to ASD across species. We thus provide a rich set of ASD-associated circRNA-miRNA-mRNA axes and a useful starting point for investigation of regulatory mechanisms in ASD pathophysiology. Conclusions: CircMiMi allows users to identify circRNA-mediated interactions in multiple species, shedding light on regulatory roles of circRNAs.
The software package and web interface are freely available at https://github.com/TreesLab/CircMiMi and http://circmimi.genomics.sinica.edu.tw/ , respectively.
- Published
- 2022
5. Transcriptomopathies of pre- and post-symptomatic frontotemporal dementia-like mice with TDP-43 depletion in forebrain neurons
- Author
Lien-Szu Wu, Wei-Cheng Cheng, Chia-Ying Chen, Ming-Che Wu, Yi-Chi Wang, Yu-Hsiang Tseng, Trees-Juen Chuang, and C.-K. James Shen
- Subjects
Circular RNAs/ frontotemporal lobar degeneration/ loss-of-function/ Mis-processing/TDP-43 ,Neurology. Diseases of the nervous system ,RC346-429
- Abstract
TAR DNA-binding protein (TDP-43) is a ubiquitously expressed nuclear protein, which participates in a number of cellular processes and has been identified as the major pathological factor in amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD). Here we constructed a conditional TDP-43 mouse with depletion of TDP-43 in the mouse forebrain and found that the mice exhibit a whole spectrum of age-dependent frontotemporal dementia-like behaviour abnormalities including perturbation of social behaviour, development of dementia-like behaviour, changes of activities of daily living, and memory loss at a later stage of life. These variations are accompanied by inflammation, neurodegeneration, and abnormal synaptic plasticity of the mouse CA1 neurons. Importantly, analysis of the cortical RNA transcripts of the conditional knockout mice at the pre-/post-symptomatic stages and the corresponding wild type mice reveals age-dependent alterations in the expression levels and RNA processing patterns of a set of genes closely associated with inflammation, social behaviour, synaptic plasticity, and neuron survival. This study not only supports the scenario that loss-of-function of TDP-43 in mice may recapitulate key behaviour features of the FTLD diseases, but also provides a list of TDP-43 target genes/transcript isoforms useful for future therapeutic research.
- Published
- 2019
6. NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors
- Author
Chia-Ying Chen and Trees-Juen Chuang
- Subjects
RNA-seq ,Non-co-linear RNA ,Circular RNA ,Trans-spliced RNA ,Gene fusion ,Alignment ambiguity ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5
- Abstract
Background: Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq data, numerous NCL event detectors have been developed and have detected thousands of NCL events in diverse species. However, there are great discrepancies in the identification results among detectors, indicating a considerable proportion of false positives in the detected NCL events. Although several helpful guidelines for evaluating the performance of NCL event detectors have been provided, a systematic guideline for measurement of NCL events identified by existing tools has not been available. Results: We develop a software tool, NCLcomparator, for systematically post-screening the intragenic or intergenic NCL events identified by various NCL detectors. NCLcomparator first examines whether the input NCL events are potentially false positives derived from ambiguous alignments (i.e., the NCL events have an alternative co-linear explanation or multiple matches against the reference genome). To evaluate the reliability of the identified NCL events, we define the NCL score (NCLscore) based on the variation in the number of supporting NCL junction reads identified by the tools examined. Of the input NCL events, we show that the ambiguous alignment-derived events have relatively lower NCLscore values than the other events, indicating that an NCL event with a higher NCLscore has a higher level of reliability. To help select highly expressed NCL events, NCLcomparator also provides a series of useful measurements such as the expression levels of the detected NCL events and their corresponding host genes and the junction usage of the co-linear splice junctions at both NCL donor and acceptor sites.
Conclusion: NCLcomparator provides useful guidelines, with the input of identified NCL events from various detectors and the corresponding paired-end RNA-seq data only, to help users select potentially high-confidence NCL events for further functional investigation. The software thus helps to facilitate future studies into NCL events, shedding light on the fundamental biology of this important but understudied class of transcripts. NCLcomparator is freely accessible at https://github.com/TreesLab/NCLcomparator.
- Published
- 2019
7. Comparative Analyses of Single-Cell Transcriptomic Profiles between In Vitro Totipotent Blastomere-like Cells and In Vivo Early Mouse Embryonic Cells
- Author
Po-Yu Lin, Denny Yang, Chi-Hsuan Chuang, Hsuan Lin, Wei-Ju Chen, Chia-Ying Chen, Trees-Juen Chuang, Chien-Ying Lai, Long-Yuan Li, Scott C. Schuyler, Frank Leigh Lu, Yu-Chuan Liu, and Jean Lu
- Subjects
totipotency ,embryonic stem cells ,totipotent blastomere-like cells ,2-cell-like cells (2CLCs) ,spliceosome inhibitor ,pladienolide B ,Cytology ,QH573-671
- Abstract
The developmental potential within pluripotent cells in the canonical model is restricted to embryonic tissues, whereas totipotent cells can differentiate into both embryonic and extraembryonic tissues. Currently, the ability to culture in vitro totipotent cells possessing molecular and functional features like those of an early embryo in vivo has been a challenge. Recently, it was reported that treatment with a single spliceosome inhibitor, pladienolide B (plaB), can successfully reprogram mouse pluripotent stem cells into totipotent blastomere-like cells (TBLCs) in vitro. The TBLCs exhibited totipotency transcriptionally and acquired expanded developmental potential with the ability to yield various embryonic and extraembryonic tissues that may be employed as novel mouse developmental cell models. However, it is disputed whether TBLCs are ‘true’ totipotent stem cells equivalent to in vivo two-cell stage embryos. To address this question, single-cell RNA sequencing was applied to TBLCs and cells from early mouse embryonic developmental stages and the data were integrated using canonical correlation analyses. Differential expression analyses were performed between TBLCs and multi-embryonic cell stages to identify differentially expressed genes. Remarkably, a subpopulation within the TBLCs population expressed a high level of the totipotent-related genes Zscan4s and displayed transcriptomic features similar to mouse two-cell stage embryonic cells. This study underscores the subtle differences between in vitro derived TBLCs and in vivo mouse early developmental cell stages at the single-cell transcriptomic level. Our study has identified a new experimental model for stem cell biology, namely ‘cluster 3’, as a subpopulation of TBLCs that can be molecularly defined as near totipotent cells.
- Published
- 2021
8. Assessment of imprinting- and genetic variation-dependent monoallelic expression using reciprocal allele descendants between human family trios
- Author
Trees-Juen Chuang, Yu-Hsiang Tseng, Chia-Ying Chen, and Yi-Da Wang
- Subjects
Medicine ,Science
- Abstract
Genomic imprinting is an important epigenetic process that silences one of the parentally-inherited alleles of a gene and thereby exhibits allele-specific expression (ASE). Detection of human imprinting events is hampered by the infeasibility of the reciprocal mating system in humans and the removal of ASE events arising from non-imprinting factors. Here, we describe a pipeline with the pattern of reciprocal allele descendants (RADs) through genotyping and transcriptome sequencing data across independent parent-offspring trios to discriminate between varied types of ASE (e.g., imprinting, genetic variation-dependent ASE, and random monoallelic expression (RME)). We show that the vast majority of ASE events are due to sequence-dependent genetic variants, which are evolutionarily conserved and may themselves play a cis-regulatory role. Particularly, 74% of non-RAD ASE events, even though they exhibit ASE biases toward the same parentally-inherited allele across different individuals, are derived from genetic variation but not imprinting. We further show that the RME effect may affect the effectiveness of the population-based method for detecting imprinting events and our pipeline can help to distinguish between these two ASE types. Taken together, this study provides a good indicator for categorization of different types of ASE, opening up this widespread and complex mechanism for comprehensive characterization.
- Published
- 2017
9. Comment on 'A comprehensive overview and evaluation of circular RNA detection tools'.
- Author
Chia-Ying Chen and Trees-Juen Chuang
- Subjects
Biology (General) ,QH301-705.5
- Published
- 2019
10. Hematopoietic Transfer of the Anti-Cancer and Lifespan-Extending Capabilities of A Genetically Engineered Blood System
- Author
Jing-Ping Wang, Chun-Hao Hung, Yao-Huei Liou, Ching-Chen Liu, Kun-Hai Yeh, Keh-Yang Wang, Zheng-Sheng Lai, Tzu-Chi Hsu, Tung-Liang Lee, Yu-Chiau Shyu, Pei-Wen Hsiao, Liuh-Yow Chen, Trees-Juen Chuang, Chen-Hsin Albert Yu, Nah-Shih Liao, and Che-Kun James Shen
- Abstract
A causal relationship exists among the aging process, organ decay and dysfunction, and the occurrence of various diseases including cancer. A genetically engineered mouse model, termed EklfK74R/K74R or Eklf(K74R), carrying a mutation on the well-conserved sumoylation site of the hematopoietic transcription factor KLF1/EKLF has been generated that possesses extended lifespan and healthy characteristics including cancer resistance. We show that the high anti-cancer capability of the Eklf(K74R) mice is gender-, age-, and genetic background-independent. Significantly, the anti-cancer capability and extended lifespan characteristics of Eklf(K74R) mice could be transferred to wild-type mice via transplantation of their bone marrow mononuclear cells. Targeted/global gene expression profiling analysis has identified changes in the expression of specific proteins and cellular pathways in the leukocytes of the Eklf(K74R) that are in the directions of anti-cancer and/or anti-aging. This study demonstrates the feasibility of developing a novel hematopoietic/blood system for long-term anti-cancer and, potentially, for anti-aging.
- Published
- 2023
11. Large-scale benchmarking of circRNA detection tools reveals large differences in sensitivity but not in precision
- Author
Marieke Vromman, Jasper Anckaert, Stefania Bortoluzzi, Alessia Buratin, Chia-Ying Chen, Qinjie Chu, Trees-Juen Chuang, Roozbeh Dehghannasiri, Christoph Dieterich, Xin Dong, Paul Flicek, Enrico Gaffo, Wanjun Gu, Chunjiang He, Steve Hoffmann, Osagie Izuogu, Michael S. Jackson, Tobias Jakobi, Eric C. Lai, Justine Nuytens, Julia Salzman, Mauro Santibanez-Koref, Peter Stadler, Olivier Thas, Eveline Vanden Eynde, Kimberly Verniers, Guoxia Wen, Jakub Westholm, Li Yang, Chu-Yu Ye, Nurten Yigit, Guo-Hua Yuan, Jinyang Zhang, Fangqing Zhao, Jo Vandesompele, and Pieter-Jan Volders
- Abstract
The detection of circular RNA molecules (circRNAs) is typically based on short-read RNA sequencing data processed by computational detection tools. During the last decade, a plethora of such tools have been developed, but a systematic comparison with orthogonal validation is missing. Here, we set up a circRNA detection tool benchmarking study, in which 16 tools were used and detected over 315,000 unique circRNAs in three deeply sequenced human cell types. Next, 1,516 predicted circRNAs were empirically validated using three orthogonal methods. Generally, tool-specific precision values are high and similar (median of 98.8%, 96.3%, and 95.5% for qPCR, RNase R, and amplicon sequencing, respectively) whereas the sensitivity and number of predicted circRNAs (ranging from 1,372 to 58,032) are the most significant tool differentiators. Furthermore, we demonstrate the complementarity of tools through the increase in detection sensitivity by considering the union of highly-precise tool combinations while keeping the number of false discoveries low. Finally, based on the benchmarking results, recommendations are put forward for circRNA detection and validation.
- Published
- 2023
12. Assessing the impacts of various factors related to identification, conservation, biogenesis, and function on circular RNA reliability
- Author
Trees-Juen Chuang, Tai-Wei Chiang, and Chia-Ying Chen
- Abstract
Circular RNAs (circRNAs) are non-polyadenylated RNAs with a continuous loop structure characterized by a non-co-linear back-splice junction (BSJ). While dozens of computational tools have been developed and identified millions of circRNA candidates in diverse species, it remains a major challenge for determining circRNA reliability due to various types of false positives. Here, we systematically assess the impacts of numerous factors related to identification, conservation, biogenesis, and function on circRNA reliability by comparisons of circRNA expression from mock (total RNAs) and the corresponding co-linear/polyadenylated RNA-depleted datasets based on three different RNA treatment approaches. Eight important indicators of circRNA reliability are determined. The relative contribution to variability explained analyses further reveal that the relative importance of these factors in affecting circRNA reliability is conservation level of circRNA > full-length circular sequences > supporting BSJ read count > both BSJ donor and acceptor splice sites at the same co-linear transcript isoforms > both BSJ donor and acceptor splice sites at the annotated exon boundaries > BSJs detected by multiple tools > supporting functional features > both BSJ donor and acceptor splice sites undergoing alternative splicing. By extracting RT-independent circRNAs, circRNAs passing multiple experimental validations, and database-specific circRNAs, we showed the additive effects of these important factors in determining circRNA reliability. This study thus provides a useful guideline and an important resource for selecting high-confidence circRNAs for further investigations.
- Published
- 2022
13. Detecting intragenic trans-splicing events with hybrid transcriptome sequencing in cancer cells
- Author
Yu-Chen Chen, Chia-Ying Chen, Tai-Wei Chiang, Ming-Hsien Chan, Michael Hsiao, Huei-Mien Ke, Isheng Jason Tsai, and Trees-Juen Chuang
- Abstract
Trans-splicing can generate non-co-linear (NCL) transcripts that consist of exons in an order topologically inconsistent with the corresponding DNA template. Detecting trans-spliced RNAs (ts-RNAs) may be interfered by false positives from experimental artifacts, circular RNAs (circRNAs), and genetic rearrangements. Particularly, intragenic ts-RNAs, which are derived from separate precursor mRNA molecules of the same genes, are often mistaken for circRNAs through analyses of high-throughput transcriptome sequencing (RNA-seq) data. In addition, the biogenesis and function of ts-RNAs remain elusive. Here we developed a bioinformatics pipeline, NCLscan-hybrid, with the integration of long and short RNA-seq reads to minimize false positives and identify intragenic ts-RNAs. We utilized two features of long reads, out-of-circle and rolling circle, to distinguish intragenic ts-RNAs from circRNAs. We also designed multiple experimental validation steps to examine each type of false positives and successfully confirmed an intragenic ts-RNA (ts-ARFGEF1) in breast cancer cells. On the basis of ectopic expression and CRISPR-based endogenous genome modification experiments, we confirmed that ts-ARFGEF1 formation was significantly dependent on the reverse complementary sequences in the flanking introns of the NCL junction. Subsequent in vitro and in vivo experiments demonstrated that ts-ARFGEF1 silencing can significantly inhibit tumor cell growth. We further showed the regulatory role of ts-ARFGEF1 in p53-mediated apoptosis through affecting the PERK/eIF2a/ATF4/CHOP signaling pathway in breast cancer cells. This study thus described both bioinformatics procedures and experimental validation steps for rigorous characterization of transcriptionally non-co-linear RNAs, expanding the discovery of this important but understudied class of RNAs.
- Published
- 2022
14. Trans-genetic effects of circular RNA expression quantitative trait loci and potential causal mechanisms in autism
- Author
Te-Lun Mai, Chia-Ying Chen, Yu-Chen Chen, Tai-Wei Chiang, and Trees-Juen Chuang
- Subjects
Cellular and Molecular Neuroscience ,Psychiatry and Mental health ,Molecular Biology
- Abstract
Genetic risk variants and transcriptional expression changes in autism spectrum disorder (ASD) were widely investigated, but their causal relationship remains largely unknown. Circular RNAs (circRNAs) are abundant in brain and often serve as upstream regulators of mRNAs. By integrating RNA-sequencing with genotype data from autistic brains, we assessed expression quantitative trait loci of circRNAs (circQTLs) that cis-regulated expression of nearby circRNAs and trans-regulated expression of distant genes (trans-eGenes) simultaneously. We thus identified 3619 circQTLs that were also trans-eQTLs and constructed 19,804 circQTL-circRNA-trans-eGene regulatory axes. We conducted two different types of approaches, mediation and partial correlation tests (MPT), to determine the axes with mediation effects of circQTLs on trans-eGene expression through circRNA expression. We showed that the mediation effects of the circQTLs (trans-eQTLs) on circRNA expression were positively correlated with the magnitude of circRNA-trans-eGene correlation of expression profile. The positive correlation became more significant after adjustment for the circQTLs. Of the 19,804 axes, 8103 passed MPT. Meanwhile, we performed causal inference test (CIT) and identified 2070 circQTL-trans-eGene-ASD diagnosis propagation paths. We showed that the CIT-passing genes were significantly enriched for ASD risk genes, genes encoding postsynaptic density proteins, and other ASD-relevant genes, supporting the relevance of the CIT-passing genes to ASD pathophysiology. Integration of MPT- and CIT-passing axes further constructed 352 circQTL-circRNA-trans-eGene-ASD diagnosis propagation paths, wherein the circRNA-trans-eGene axes may act as causal mediators for the circQTL-ASD diagnosis associations. These analyses were also successfully applied to an independent dataset from schizophrenia brains. 
Collectively, this study provided the first framework for systematically investigating trans-genetic effects of circQTLs and inferring the corresponding causal relations in diseases. The identified circQTL-circRNA-trans-eGene regulatory interactions, particularly the internal modules that were previously implicated in the examined disorders, also provided a helpful dataset for further investigating causative biology and cryptic regulatory mechanisms underlying the neuropsychiatric diseases.
- Published
- 2022
15. Assessing the impacts of various factors on circular RNA reliability.
- Author
Trees-Juen Chuang, Tai-Wei Chiang, and Chia-Ying Chen
- Published
- 2023
16. Trans-genetic effects of circular RNA expression quantitative trait loci and potential causal mechanisms in autism
- Author
Tai-Wei Chiang, Trees-Juen Chuang, Te-Lun Mai, and Chia-Ying Chen
- Subjects
Genetics ,Circular RNA ,Expression quantitative trait loci ,medicine ,Autism ,Biology ,medicine.disease
- Abstract
Genetic risk variants and transcriptional expression changes in autism spectrum disorder (ASD) have been widely identified, but their causal relationship is largely unknown. Circular RNAs (circRNAs) are abundant in brain and often serve as upstream regulators of mRNAs. By integrating RNA-sequencing with genotyping data from autistic brains, we assessed expression quantitative trait loci (eQTL) of circRNAs (circQTLs) that influenced expression of distant genes (trans-eGenes) and constructed 43,372 circQTL-trans-eGene pairs. Mediation test suggested that 19,393 pairs were significantly cis-mediated by expression of circRNAs near the circQTLs; meanwhile, causal inference test (CIT) suggested 1,000 pairs influencing 708 trans-eGenes, wherein trans-eGene expression mediated trans-eQTL effects on ASD diagnosis. The 708 trans-eGenes were enriched for ASD risk genes and overrepresented in neurons in the upper neocortical layer. Integration of mediation test- and CIT-passing pairs further constructed 257 circQTL-circRNA-trans-eGene-ASD propagation paths. These findings increase our understanding of causative biology and cryptic regulatory mechanisms underlying ASD.
- Published
- 2021
17. Genome-wide, integrative analysis of circular RNA dysregulation and the corresponding circular RNA-microRNA-mRNA regulatory axes in autism
- Author
Te-Lun Mai, Yen-Ju Chen, Chia-Ying Chen, Trees-Juen Chuang, Sachin Kumar Gupta, Yi-Da Wang, Chih-Fan Chuang, Yu-Chen Chen, and Laising Yen
- Subjects
Autism Spectrum Disorder ,Computational biology ,Biology ,Genome ,behavioral disciplines and activities ,Cell Line ,03 medical and health sciences ,0302 clinical medicine ,Neural Stem Cells ,Circular RNA ,microRNA ,Gene expression ,mental disorders ,Genetics ,Humans ,RNA, Messenger ,Gene ,Genetics (clinical) ,030304 developmental biology ,Regulation of gene expression ,Neurons ,0303 health sciences ,Genome, Human ,Research ,RNA ,Brain ,RNA, Circular ,Gene expression profiling ,MicroRNAs ,Gene Expression Regulation ,Astrocytes ,030217 neurology & neurosurgery
- Abstract
Circular RNAs (circRNAs), a class of long noncoding RNAs, are known to be enriched in mammalian neural tissues. Although a wide range of dysregulation of gene expression in autism spectrum disorder (ASD) have been reported, the role of circRNAs in ASD remains largely unknown. Here, we performed genome-wide circRNA expression profiling in postmortem brains from individuals with ASD and controls and identified 60 circRNAs and three coregulated modules that were perturbed in ASD. By integrating circRNA, microRNA, and mRNA dysregulation data derived from the same cortex samples, we identified 8170 ASD-associated circRNA-microRNA-mRNA interactions. Putative targets of the axes were enriched for ASD risk genes and genes encoding inhibitory postsynaptic density (PSD) proteins, but not for genes implicated in monogenetic forms of other brain disorders or genes encoding excitatory PSD proteins. This reflects the previous observation that ASD-derived organoids show overproduction of inhibitory neurons. We further confirmed that some ASD risk genes (NLGN1, STAG1, HSD11B1, VIP, and UBA6) were regulated by an up-regulated circRNA (circARID1A) via sponging a down-regulated microRNA (miR-204-3p) in human neuronal cells. Particularly, alteration of NLGN1 expression is known to affect the dynamic processes of memory consolidation and strengthening. To the best of our knowledge, this is the first systems-level view of circRNA regulatory networks in ASD cortex samples. We provided a rich set of ASD-associated circRNA candidates and the corresponding circRNA-microRNA-mRNA axes, particularly those involving ASD risk genes. Our findings thus support a role for circRNA dysregulation and the corresponding circRNA-microRNA-mRNA axes in ASD pathophysiology.
- Published
- 2020
18. Genome-wide, integrative analysis implicates circular RNA dysregulation in autism and the corresponding circular RNA-microRNA-mRNA regulatory axes
- Author
Sachin Kumar Gupta, Chia-Ying Chen, Chih-Fan Chuang, Trees-Juen Chuang, Yu-Chen Chen, Yi-Da Wang, Te-Lun Mai, and Laising Yen
- Subjects
Gene expression profiling ,Messenger RNA ,Circular RNA ,mental disorders ,microRNA ,Gene expression ,Computational biology ,Biology ,behavioral disciplines and activities ,Gene ,Genome ,Postsynaptic density
- Abstract
Circular RNAs (circRNAs), a class of long non-coding RNAs, are known to be enriched in mammalian brain and neural tissues. While the effects of regulatory genetic variants on gene expression in autism spectrum disorder (ASD) have been widely reported, the role of circRNAs in ASD remains largely unknown. Here, we performed genome-wide circRNA expression profiling in post-mortem brains from individuals with ASD and controls and identified 60 circRNAs and three co-regulated modules that were perturbed in ASD. By integrating circRNA, microRNA, and mRNA dysregulation data derived from the same cortex samples, we identified 8,170 ASD-associated circRNA-microRNA-mRNA interactions. Putative targets of the axes were enriched for ASD risk genes and genes encoding inhibitory postsynaptic density (PSD) proteins, but not for genes implicated in monogenetic forms of other brain disorders or genes encoding excitatory PSD proteins. This result reflects the previous observation that ASD-derived organoids exhibit overproduction of inhibitory neurons. We further confirmed that some ASD risk genes (NLGN1, STAG1, HSD11B1, VIP, and UBA6) were indeed regulated by an upregulated circRNA (circARID1A) via sponging a downregulated microRNA (miR-204-3p) in human neuronal cells. We provided a systems-level view of landscape of circRNA regulatory networks in ASD cortex samples. We also provided multiple lines of evidence for the functional role of ASD for circRNA dysregulation and a rich set of ASD-associated circRNA candidates and the corresponding circRNA-miRNA-mRNA axes, particularly those involving ASD risk genes. Our findings thus support a role for circRNA dysregulation and the corresponding circRNA-microRNA-mRNA axes in ASD pathophysiology.
- Published
- 2019
19. Transcriptomopathies of pre- and post-symptomatic frontotemporal dementia-like mice with TDP-43 depletion in forebrain neurons
- Author
-
Yu Hsiang Tseng, Trees-Juen Chuang, Wei Cheng Cheng, Che-Kun Shen, Lien Szu Wu, Yi-Chi Wang, Chia Ying Chen, and Ming Che Wu
- Subjects
0301 basic medicine ,Aging ,Neurodegenerative ,lcsh:RC346-429 ,Transgenic ,Mice ,0302 clinical medicine ,Conditional gene knockout ,2.1 Biological and endogenous factors ,Nuclear protein ,Amyotrophic lateral sclerosis ,Aetiology ,Mice, Knockout ,Neurons ,Neurodegeneration ,Age Factors ,Frontotemporal lobar degeneration ,DNA-Binding Proteins ,Frontotemporal Dementia (FTD) ,Frontotemporal Dementia ,Neurological ,Frontotemporal dementia ,Knockout ,Clinical Sciences ,Mice, Transgenic ,Biology ,Pathology and Forensic Medicine ,03 medical and health sciences ,Cellular and Molecular Neuroscience ,Prosencephalon ,Rare Diseases ,mental disorders ,medicine ,Acquired Cognitive Impairment ,Genetics ,Animals ,lcsh:Neurology. Diseases of the nervous system ,Research ,Gene Expression Profiling ,Circular RNAs/ frontotemporal lobar degeneration/ loss-of-function/ Mis-processing/TDP-43 ,Neurosciences ,Alzheimer's Disease including Alzheimer's Disease Related Dementias (AD/ADRD) ,medicine.disease ,nervous system diseases ,Brain Disorders ,030104 developmental biology ,Forebrain ,Synaptic plasticity ,Dementia ,Neurology (clinical) ,Biochemistry and Cell Biology ,ALS ,Transcriptome ,Neuroscience ,030217 neurology & neurosurgery - Abstract
TAR DNA-binding protein (TDP-43) is a ubiquitously expressed nuclear protein, which participates in a number of cellular processes and has been identified as the major pathological factor in amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD). Here we constructed a conditional TDP-43 mouse with depletion of TDP-43 in the mouse forebrain and found that the mice exhibit a whole spectrum of age-dependent frontotemporal dementia-like behaviour abnormalities including perturbation of social behaviour, development of dementia-like behaviour, changes of activities of daily living, and memory loss at a later stage of life. These variations are accompanied by inflammation, neurodegeneration, and abnormal synaptic plasticity of the mouse CA1 neurons. Importantly, analysis of the cortical RNA transcripts of the conditional knockout mice at the pre-/post-symptomatic stages and the corresponding wild type mice reveals age-dependent alterations in the expression levels and RNA processing patterns of a set of genes closely associated with inflammation, social behaviour, synaptic plasticity, and neuron survival. This study not only supports the scenario that loss-of-function of TDP-43 in mice may recapitulate key behaviour features of the FTLD diseases, but also provides a list of TDP-43 target genes/transcript isoforms useful for future therapeutic research. Electronic supplementary material: The online version of this article (10.1186/s40478-019-0674-x) contains supplementary material, which is available to authorized users.
- Published
- 2019
20. Additional file 1: of Transcriptomopathies of pre- and post-symptomatic frontotemporal dementia-like mice with TDP-43 depletion in forebrain neurons
- Author
-
Lien-Szu Wu, Wei-Cheng Cheng, Chia-Ying Chen, Ming-Che Wu, Yi-Chi Wang, Yu-Hsiang Tseng, Trees-Juen Chuang, and C.-K. Shen
- Subjects
endocrine system ,mental disorders ,nutritional and metabolic diseases ,nervous system diseases - Abstract
Figure S1. Altered activity of daily living (ADL) of TDP-43 cKO mice. Figure S2. Immunohistochemistry staining of brain slices. Figure S3. Persisting reactive astrocytosis in TDP-43 cKO mouse forebrain. Figure S4. Electrophysiology measurements. Figure S5. Mis-regulated genes in TDP-43 cKO mice. Figure S6. Alterations of the processing events of cortical RNAs in TDP-43 cKO mice. Figure S7. The alternative uses of poly(A) sites in the cortical RNAs of TDP-43 cKO mice. Figure S8. qRT-PCR validation of RNA splicing events altered in 3- and 12-month-old TDP-43 cKO mice. Figure S9. Validation of altered splicing events in TDP-43 cKO mice. Figure S10. Calcium signaling and synaptic long term potentiation pathway analysis using Ingenuity Pathway Analysis (IPA). (PDF 2990 kb)
- Published
- 2019
- Full Text
- View/download PDF
21. Additional file 3: of Transcriptomopathies of pre- and post-symptomatic frontotemporal dementia-like mice with TDP-43 depletion in forebrain neurons
- Author
-
Lien-Szu Wu, Wei-Cheng Cheng, Chia-Ying Chen, Ming-Che Wu, Yi-Chi Wang, Yu-Hsiang Tseng, Trees-Juen Chuang, and C.-K. Shen
- Subjects
endocrine system ,nervous system ,mental disorders ,sense organs - Abstract
Table S2. Changes of the mRNA levels of social behaviour-related genes in the neocortex of TDP-43 cKO mice relative to Ctrl mice. (PDF 72 kb)
- Published
- 2019
- Full Text
- View/download PDF
22. Additional file 1: of NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors
- Author
-
Chia-Ying Chen and Trees-Juen Chuang
- Abstract
Table S1. Parameter settings of intragenic/intergenic NCL detectors tested in this study. (DOCX 34 kb)
- Published
- 2019
- Full Text
- View/download PDF
23. Additional file 2: of Transcriptomopathies of pre- and post-symptomatic frontotemporal dementia-like mice with TDP-43 depletion in forebrain neurons
- Author
-
Lien-Szu Wu, Wei-Cheng Cheng, Chia-Ying Chen, Ming-Che Wu, Yi-Chi Wang, Yu-Hsiang Tseng, Trees-Juen Chuang, and C.-K. Shen
- Subjects
nervous system ,mental disorders - Abstract
Table S1. Chromosome numbers, donor positions, acceptor positions, and gene names of the transcripts whose individual circular RNAs change in the neocortex of 3- and 12-month-old TDP-43 cKO mice but not Ctrl mice. (PDF 141 kb)
- Published
- 2019
- Full Text
- View/download PDF
24. Nonsynonymous A-to-I RNA editing contributes to burden of deleterious missense variants in healthy individuals
- Author
-
Trees-Juen Chuang and Te-Lun Mai
- Subjects
Genetics ,Nonsynonymous substitution ,education.field_of_study ,Population ,RNA ,Biology ,RNA editing ,Missense mutation ,sense organs ,1000 Genomes Project ,education ,skin and connective tissue diseases ,Gene ,Allele frequency - Abstract
Adenosine-to-inosine (A-to-I) RNA editing is a very common post-transcriptional modification that can lead to A-to-G changes at the RNA level and compensate for G-to-A genomic changes to a certain extent. It has been shown that each healthy individual can carry dozens of missense variants predicted to be severely deleterious. Why strongly detrimental variants are preserved in a population and not eliminated by negative natural selection remains mostly unclear. Here we ask if RNA editing correlates with the burden of deleterious A/G polymorphisms in a population. Integrating genome and transcriptome sequencing data from 447 human lymphoblastoid cell lines, we show that nonsynonymous editing activities (prevalence/level) are negatively correlated with the deleteriousness of A-to-G genomic changes and positively correlated with that of G-to-A genomic changes within the population. We find a significantly negative correlation between nonsynonymous editing activities and allele frequency of A within the population. This negative editing-allele frequency correlation is particularly strong when editing sites are located in highly important genes/loci. Examinations of deleterious missense variants from the 1000 genomes project further show a significantly higher mutational burden in G-to-A changes than in other types of changes. The level of the mutational burden in G-to-A changes increases with increasing deleterious effects of the changes. Moreover, the deleteriousness of G-to-A changes is significantly positively correlated with the percentage of binding motif of editing enzymes at the variants. Overall, we show that nonsynonymous editing contributes to the increased burden of G-to-A missense mutations in healthy individuals, expanding RNA editing in pathogenomics studies.
- Published
- 2018
- Full Text
- View/download PDF
25. A-to-I RNA editing contributes to the persistence of predicted damaging mutations in populations
- Author
-
Trees-Juen Chuang and Te-Lun Mai
- Subjects
Nonsynonymous substitution ,Adenosine ,Population ,Mutation, Missense ,Biology ,03 medical and health sciences ,0302 clinical medicine ,Gene Frequency ,Genetics ,Missense mutation ,Humans ,education ,skin and connective tissue diseases ,Gene ,Allele frequency ,Genetics (clinical) ,030304 developmental biology ,0303 health sciences ,education.field_of_study ,Research ,RNA ,Inosine ,Enzyme binding ,RNA editing ,sense organs ,RNA Editing ,030217 neurology & neurosurgery - Abstract
Adenosine-to-inosine (A-to-I) RNA editing is a very common co-/posttranscriptional modification that can lead to A-to-G changes at the RNA level and compensate for G-to-A genomic changes to a certain extent. It has been shown that each healthy individual can carry dozens of missense variants predicted to be severely deleterious. Why strongly detrimental variants are preserved in a population and not eliminated by negative natural selection remains mostly unclear. Here, we ask if RNA editing correlates with the burden of deleterious A/G polymorphisms in a population. Integrating genome and transcriptome sequencing data from 447 human lymphoblastoid cell lines, we show that nonsynonymous editing activities (prevalence/level) are negatively correlated with the deleteriousness of A-to-G genomic changes and positively correlated with that of G-to-A genomic changes within the population. We find a significantly negative correlation between nonsynonymous editing activities and allele frequency of A within the population. This negative editing-allele frequency correlation is particularly strong when editing sites are located in highly important genes/loci. Examinations of deleterious missense variants from the 1000 Genomes Project further show a significantly higher proportion of rare missense mutations for G-to-A changes than for other types of changes. The proportion for G-to-A changes increases with increasing deleterious effects of the changes. Moreover, the deleteriousness of G-to-A changes is significantly positively correlated with the percentage of editing enzyme binding motifs at the variants. Overall, we show that nonsynonymous editing is associated with the increased burden of G-to-A missense mutations in healthy individuals, expanding RNA editing in pathogenomics studies.
- Published
- 2018
26. An Evolutionary Landscape of A-to-I RNA Editome across Metazoan Species
- Author
-
Trees-Juen Chuang, Chia-Ying Chen, Yi-Da Wang, Li-Yuan Hung, Te-Lun Mai, Min-Yu Yang, Yen-Ju Chen, and Tai-Wei Chiang
- Subjects
0301 basic medicine ,Nonsynonymous substitution ,Adenosine ,Evolutionary landscape ,Biology ,DNA sequencing ,Evolution, Molecular ,03 medical and health sciences ,0302 clinical medicine ,evolution ,Genetics ,Animals ,Cluster Analysis ,Humans ,dynamic editome ,Ecology, Evolution, Behavior and Systematics ,A-to-I RNA editing ,Sequence Analysis, RNA ,ADAR motif ,RNA ,myr ,ADAR ,Inosine ,Divergent evolution ,030104 developmental biology ,RNA editing ,Evolutionary biology ,RNA Editing ,030217 neurology & neurosurgery ,Research Article - Abstract
Adenosine-to-inosine (A-to-I) editing is widespread across the kingdom Metazoa. However, owing to the lack of comprehensive analysis in nonmodel animals, the evolutionary history of A-to-I editing remains largely unexplored. Here, we detect high-confidence editing sites using clustering and conservation strategies based on RNA sequencing data alone, without using single-nucleotide polymorphism information or genome sequencing data from the same sample. We thereby unveil the first evolutionary landscape of A-to-I editing maps across 20 metazoan species (from worm to human), providing unprecedented evidence on how the editing mechanism gradually expands its territory and increases its influence along the history of evolution. Our results revealed that highly clustered and conserved editing sites tended to have a higher editing level and a higher magnitude of the ADAR motif. The ratio of the frequency of nonsynonymous editing to that of synonymous editing increased remarkably with increasing conservation level of A-to-I editing. These results thus suggest a potential functional benefit of highly clustered and conserved editing sites. In addition, spatiotemporal dynamics analyses reveal a conserved enrichment of editing and ADAR expression in the central nervous system throughout more than 300 Myr of divergent evolution in complex animals and the comparability of editing patterns between invertebrates and between vertebrates during development. This study provides evolutionary and dynamic aspects of the A-to-I editome across metazoan species, expanding this important but understudied class of nongenomically encoded events for comprehensive characterization.
- Published
- 2017
27. Biogenesis, identification, and function of exonic circular RNAs
- Author
-
Chia-Ying Chen, Iju Chen, and Trees-Juen Chuang
- Subjects
Genetics ,Exon ,microRNA ,RNA splicing ,RNA ,Biology ,Small nucleolar RNA ,Molecular Biology ,Biochemistry ,Function (biology) ,Biogenesis ,Reference genome - Abstract
Circular RNAs (circRNAs) arise during post-transcriptional processes, in which a single-stranded RNA molecule forms a circle through covalent binding. Previously, circRNA products were often regarded to be splicing intermediates, by-products, or products of aberrant splicing. But recently, rapid advances in high-throughput RNA sequencing (RNA-seq) for global investigation of nonco-linear (NCL) RNAs, which comprise sequence segments that are topologically inconsistent with the reference genome, have led to renewed interest in this type of NCL RNA (i.e., circRNA), especially exonic circRNAs (ecircRNAs). Although the biogenesis and function of ecircRNAs are mostly unknown, some ecircRNAs are abundant, highly expressed, or evolutionarily conserved. Some ecircRNAs have been shown to affect microRNA regulation, and probably play roles in regulating parental gene transcription, cell proliferation, and RNA-binding proteins, indicating their functional potential for development as diagnostic tools. To date, thousands of ecircRNAs have been identified in multiple tissues/cell types from diverse species, through analyses of RNA-seq data. However, the detection of ecircRNA candidates involves several major challenges, including discrimination between ecircRNAs and other types of NCL RNAs (e.g., trans-spliced RNAs and genetic rearrangements); removal of sequencing errors, alignment errors, and in vitro artifacts; and the reconciliation of heterogeneous results arising from the use of different bioinformatics methods or sequencing data generated under different treatments. Such challenges may severely hamper the understanding of ecircRNAs. Herein, we review the biogenesis, identification, properties, and function of ecircRNAs, and discuss some unanswered questions regarding ecircRNAs. We also evaluate the accuracy (in terms of sensitivity and precision) of some well-known circRNA-detecting methods.
- Published
- 2015
28. Assessment of imprinting- and genetic variation-dependent monoallelic expression using reciprocal allele descendants between human family trios
- Author
-
Yu-Hsiang Tseng, Chia-Ying Chen, Yi-Da Wang, and Trees-Juen Chuang
- Subjects
0301 basic medicine ,animal structures ,Genotype ,Genotyping Techniques ,Science ,Population ,Biology ,Article ,Genomic Imprinting ,03 medical and health sciences ,Genetic variation ,Humans ,Allele ,Imprinting (psychology) ,education ,Gene ,Alleles ,Family Health ,Genetics ,education.field_of_study ,Multidisciplinary ,Epigenetic Process ,Gene Expression Profiling ,Genetic Variation ,030104 developmental biology ,Medicine ,Genomic imprinting - Abstract
Genomic imprinting is an important epigenetic process that silences one of the parentally-inherited alleles of a gene and thereby exhibits allele-specific expression (ASE). Detection of human imprinting events is hampered by the infeasibility of the reciprocal mating system in humans and the removal of ASE events arising from non-imprinting factors. Here, we describe a pipeline with the pattern of reciprocal allele descendants (RADs) through genotyping and transcriptome sequencing data across independent parent-offspring trios to discriminate between varied types of ASE (e.g., imprinting, genetic variation-dependent ASE, and random monoallelic expression (RME)). We show that the vast majority of ASE events are due to sequence-dependent genetic variants, which are evolutionarily conserved and may themselves play a cis-regulatory role. Particularly, 74% of non-RAD ASE events, even though they exhibit ASE biases toward the same parentally-inherited allele across different individuals, are derived from genetic variation but not imprinting. We further show that the RME effect may affect the effectiveness of the population-based method for detecting imprinting events and our pipeline can help to distinguish between these two ASE types. Taken together, this study provides a good indicator for categorization of different types of ASE, opening up this widespread and complex mechanism for comprehensive characterization.
- Published
- 2017
29. Is an observed non-co-linear RNA product spliced in trans, in cis or just in vitro?
- Author
-
Trees-Juen Chuang, Li-Yuan Hung, Chun-Ying Yu, Hung-Chih Kuo, and Hsiao-Jung Liu
- Subjects
RNA Splicing ,Trans-splicing ,Biology ,Trans-Splicing ,Evolution, Molecular ,Mice ,Circular RNA ,Genetics ,Animals ,Humans ,Cells, Cultured ,Reverse Transcriptase Polymerase Chain Reaction ,Sequence Analysis, RNA ,Gene Expression Profiling ,Intron ,High-Throughput Nucleotide Sequencing ,RNA ,Non-coding RNA ,Macaca mulatta ,RNA editing ,RNA splicing ,RNA Splice Sites ,Artifacts ,Small nuclear RNA - Abstract
Global transcriptome investigations often result in the detection of an enormous number of transcripts composed of non-co-linear sequence fragments. Such ‘aberrant’ transcript products may arise from post-transcriptional events or genetic rearrangements, or may otherwise be false positives (sequencing/alignment errors or in vitro artifacts). Moreover, post-transcriptionally non-co-linear (‘PtNcl’) transcripts can arise from trans-splicing or back-splicing in cis (to generate so-called ‘circular RNA’). Here, we collected previously-predicted human non-co-linear RNA candidates, and designed a validation procedure integrating in silico filters with multiple experimental validation steps to examine their authenticity. We showed that >50% of the tested candidates were in vitro artifacts, even though some had been previously validated by RT-PCR. After excluding the possibility of genetic rearrangements, we distinguished between trans-spliced and circular RNAs, and confirmed that these two splicing forms can share the same non-co-linear junction. Importantly, the experimentally-confirmed PtNcl RNA events and their corresponding PtNcl splicing types (i.e. trans-splicing, circular RNA, or both sharing the same junction) were all expressed in rhesus macaque, and some were even expressed in mouse. Our study thus describes an essential procedure for confirming PtNcl transcripts, and provides further insight into the evolutionary role of PtNcl RNA events, opening up this important, but understudied, class of post-transcriptional events for comprehensive characterization.
- Published
- 2014
30. Impacts of Pretranscriptional DNA Methylation, Transcriptional Transcription Factor, and Posttranscriptional microRNA Regulations on Protein Evolutionary Rate
- Author
-
Tai-Wei Chiang and Trees-Juen Chuang
- Subjects
Transcription, Genetic ,protein evolutionary rate ,comparative genomics ,Biology ,Evolution, Molecular ,Mice ,Epigenetics of physical exercise ,Histone methylation ,Genetics ,Animals ,Humans ,Gene Regulatory Networks ,Promoter Regions, Genetic ,Gene ,Post-transcriptional regulation ,RNA-Directed DNA Methylation ,Ecology, Evolution, Behavior and Systematics ,transcription factor ,Regulation of gene expression ,microRNA ,promoter/gene body methylation ,Methylation ,DNA Methylation ,Macaca mulatta ,MicroRNAs ,Gene Expression Regulation ,DNA methylation ,Research Article ,Transcription Factors - Abstract
Gene expression is largely regulated by DNA methylation, transcription factor (TF), and microRNA (miRNA) before, during, and after transcription, respectively. Although the evolutionary effects of TF/miRNA regulations have been widely studied, an evolutionary analysis that simultaneously accounts for DNA methylation, TF, and miRNA regulations, and the question of whether promoter methylation and gene body (coding regions) methylation have different effects on the rate of gene evolution, remain uninvestigated. Here, we compared human-macaque and human-mouse protein evolutionary rates against experimentally determined single base-resolution DNA methylation data, revealing that promoter methylation level is positively correlated with protein evolutionary rates but negatively correlated with TF/miRNA regulations, whereas the opposite was observed for gene body methylation level. Our results showed that the relative importance of these regulatory factors in determining the rate of mammalian protein evolution is as follows: Promoter methylation ≈ miRNA regulation > gene body methylation > TF regulation, and further indicated that promoter methylation and miRNA regulation have a significant dependent effect on protein evolutionary rates. Although the mechanisms underlying cooperation between DNA methylation and TFs/miRNAs in gene regulation remain unclear, our study helps to illuminate not only the impact of these regulatory factors on mammalian protein evolution but also their intricate interaction within gene regulatory networks.
- Published
- 2014
31. Integrative transcriptome sequencing identifies trans-splicing events with important roles in human embryonic stem cell pluripotency
- Author
-
Trees-Juen Chuang, Chun-Ying Yu, Hung-Chih Kuo, Michael Hsiao, Cheng-Fu Kao, Ching-Yu Chuang, and Chan-Shuo Wu
- Subjects
Pluripotent Stem Cells ,Homeobox protein NANOG ,Trans-splicing ,Computational biology ,Biology ,Cell Line ,Trans-Splicing ,Histones ,Transcriptome ,Mice ,Chimeric RNA ,Genetics ,Animals ,Humans ,Induced pluripotent stem cell ,Embryonic Stem Cells ,Genetics (clinical) ,Oligonucleotide Array Sequence Analysis ,Homeodomain Proteins ,Genome ,Sequence Analysis, RNA ,Gene Expression Profiling ,Polycomb Repressive Complex 2 ,Gene Expression Regulation, Developmental ,High-Throughput Nucleotide Sequencing ,Reproducibility of Results ,Nanog Homeobox Protein ,Non-coding RNA ,Embryonic stem cell ,Neoplasm Proteins ,Organ Specificity ,RNA, Long Noncoding ,Software ,Transcription Factors - Abstract
Trans-splicing is a post-transcriptional event that joins exons from separate pre-mRNAs. Detection of trans-splicing is usually severely hampered by experimental artifacts and genetic rearrangements. Here, we develop a new computational pipeline, TSscan, which integrates different types of high-throughput long-/short-read transcriptome sequencing of different human embryonic stem cell (hESC) lines to effectively minimize false positives while detecting trans-splicing. Combining TSscan screening with multiple experimental validation steps revealed that most chimeric RNA products were platform-dependent experimental artifacts of RNA sequencing. We successfully identified and confirmed four trans-spliced RNAs, including the first reported trans-spliced large intergenic noncoding RNA (“tsRMST”). We showed that these trans-spliced RNAs were all highly expressed in human pluripotent stem cells and differentially expressed during hESC differentiation. Our results further indicated that tsRMST can contribute to pluripotency maintenance of hESCs by suppressing lineage-specific gene expression through the recruitment of NANOG and the PRC2 complex factor, SUZ12. Taken together, our findings provide important insights into the role of trans-splicing in pluripotency maintenance of hESCs and help to facilitate future studies into trans-splicing, opening up this important but understudied class of post-transcriptional events for comprehensive characterization.
- Published
- 2013
32. A Novel Framework for the Identification and Analysis of Duplicons between Human and Chimpanzee
- Author
-
Yao-Ting Huang, Trees-Juen Chuang, and Shian-Zu Wu
- Subjects
DNA Copy Number Variations ,Pan troglodytes ,Non-allelic homologous recombination ,lcsh:Medicine ,Biology ,Genome ,General Biochemistry, Genetics and Molecular Biology ,Segmental Duplications, Genomic ,Species Specificity ,Phylogenetics ,Gene duplication ,Animals ,Cluster Analysis ,Humans ,Copy-number variation ,Homologous Recombination ,Phylogeny ,Segmental duplication ,Genetics ,General Immunology and Microbiology ,Phylogenetic tree ,Mosaicism ,Methodology Report ,lcsh:R ,Molecular Sequence Annotation ,General Medicine ,Chromosomes, Mammalian ,Markov Chains ,Evolutionary biology ,Human genome - Abstract
Human and other primate genomes consist of many segmental duplications (SDs) due to fixation of copy number variations (CNVs). The structure of these duplications within the human genome has been shown to be a complex mosaic composed of juxtaposed subunits (called duplicons). These duplicons are difficult to uncover from the mosaic repeat structure. In addition, the distribution and evolution of duplicons among primates are still poorly investigated. In this paper, we develop a statistical framework for discovering duplicons via integration of a Hidden Markov Model (HMM) and a permutation test. Our comparative analysis indicates that the mosaic structure of duplicons is common in CNV/SD regions of both human and chimpanzee genomes, and a subset of core duplicons is shared by the majority of CNVs/SDs. Phylogenetic analyses using duplicons suggested that most CNVs/SDs share common duplication ancestry. Many human/chimpanzee duplicons flank both ends of CNVs, which may be hotspots of nonallelic homologous recombination.
- Published
- 2013
33. Integrative transcriptome sequencing reveals extensive alternative trans-splicing and cis-backsplicing in human cells
- Author
-
Chia-Ying Chen, Hsin-Hua Cho, Tien-Hsien Chang, Chia-Ning Shen, Hung-Chih Kuo, Tai-Wei Chiang, Yu-Ting Hsiao, Min-Yu Yang, Trees-Juen Chuang, Yi-Hua Chen, Te-Lun Mai, Yen-Ju Chen, Shan-Chi Hsieh, Mei-Yeh Lu, Tzu-Chien Kuo, Chung-Shu Yeh, and Yi-Da Wang
- Subjects
0301 basic medicine ,RNA Splicing ,Trans-splicing ,Alu element ,Computational biology ,Biology ,Trans-Splicing ,Transcriptome ,03 medical and health sciences ,Exon ,0302 clinical medicine ,Circular RNA ,Genetics ,RNA and RNA-protein complexes ,Humans ,Gene Expression Profiling ,RNA ,Exons ,RNA, Circular ,Gene expression profiling ,Alternative Splicing ,030104 developmental biology ,RNA splicing ,030217 neurology & neurosurgery - Abstract
Transcriptionally non-co-linear (NCL) transcripts can originate from trans-splicing (trans-spliced RNA; ‘tsRNA’) or cis-backsplicing (circular RNA; ‘circRNA’). While numerous circRNAs have been detected in various species, tsRNAs remain largely uninvestigated. Here, we utilize integrative transcriptome sequencing of poly(A)- and non-poly(A)-selected RNA-seq data from diverse human cell lines to distinguish between tsRNAs and circRNAs. We identified 24,498 NCL events and found that a considerable proportion (20–35%) of them arise from both tsRNAs and circRNAs, representing extensive alternative trans-splicing and cis-backsplicing in human cells. We show that sequence generalities of exon circularization are also observed in tsRNAs. Recapitulation of NCL RNAs further shows that inverted Alu repeats can simultaneously promote the formation of tsRNAs and circRNAs. However, tsRNAs and circRNAs exhibit quite different, or even opposite, expression patterns, in terms of correlation with the expression of their co-linear counterparts, expression breadth/abundance, transcript stability, and subcellular localization preference. These results indicate that tsRNAs and circRNAs may play different regulatory roles and analysis of NCL events should take the joint effects of different NCL-splicing types and joint effects of multiple NCL events into consideration. This study describes the first transcriptome-wide analysis of trans-splicing and cis-backsplicing, expanding our understanding of the complexity of the human transcriptome.
- Published
- 2016
34. Purifying selection shapes the coincident SNP distribution of primate coding sequences
- Author
-
Chan-Shuo Wu, Chia-Ying Chen, Trees-Juen Chuang, and Li-Yuan Hung
- Subjects
0301 basic medicine ,Genetics ,Mutation rate ,Multidisciplinary ,Pan troglodytes ,Sequence analysis ,Single-nucleotide polymorphism ,Sequence Analysis, DNA ,Biology ,Polymorphism, Single Nucleotide ,Article ,Evolution, Molecular ,03 medical and health sciences ,Negative selection ,030104 developmental biology ,Mutation Rate ,Databases, Genetic ,Coding region ,SNP ,Animals ,Humans ,Selection, Genetic ,Selection (genetic algorithm) ,Sequence (medicine) - Abstract
Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNPO/E) to estimate the prevalence of human-chimpanzee coSNPs and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNPO/E is much higher at zero-fold than at nonzero-fold degenerate sites; such a difference is due to an elevation of coSNPO/E at zero-fold degenerate sites, rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, or sequencing techniques; and hold in broad generality across primates. We find that this discrepancy cannot be fully explained by sequence contexts, shared ancestral polymorphisms, SNP density and recombination rate and that coSNPO/E in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNPO/E independently and coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These results suggest that coSNPs may represent a “signature” during primate protein evolution.
- Published
- 2016
35. Position-dependent correlations between DNA methylation and the evolutionary rates of mammalian coding exons
- Author
-
Feng-Chi Chen, Trees-Juen Chuang, and Yen-Zho Chen
- Subjects
Nonsynonymous substitution ,Mutation, Missense ,Biology ,Cell Line ,Evolution, Molecular ,Mice ,Open Reading Frames ,chemistry.chemical_compound ,Species Specificity ,Animals ,Humans ,Epigenetics ,Gene ,Genetics ,Regulation of gene expression ,Multidisciplinary ,Exons ,Biological Sciences ,DNA Methylation ,Gene Expression Regulation ,CpG site ,chemistry ,DNA methylation ,Macaca ,Synonymous substitution ,DNA - Abstract
DNA cytosine methylation is a central epigenetic marker that is usually mutagenic and may increase the level of sequence divergence. However, methylated genes have been reported to evolve more slowly than unmethylated genes. Hence, there is a controversy on whether DNA methylation is correlated with increased or decreased protein evolutionary rates. We hypothesize that this controversy has resulted from the differential correlations between DNA methylation and the evolutionary rates of coding exons in different genic positions. To test this hypothesis, we compare human–mouse and human–macaque exonic evolutionary rates against experimentally determined single-base resolution DNA methylation data derived from multiple human cell types. We show that DNA methylation is significantly related to within-gene variations in evolutionary rates. First, DNA methylation level is more strongly correlated with C-to-T mutations at CpG dinucleotides in the first coding exons than in the internal and last exons, although it is positively correlated with the synonymous substitution rate in all exon positions. Second, for the first exons, DNA methylation level is negatively correlated with exonic expression level, but positively correlated with both nonsynonymous substitution rate and the sample specificity of DNA methylation level. For the internal and last exons, however, we observe the opposite correlations. Our results imply that DNA methylation level is differentially correlated with the biological (and evolutionary) features of coding exons in different genic positions. The first exons appear more prone to the mutagenic effects, whereas the other exons are more influenced by the regulatory effects of DNA methylation.
- Published
- 2012
36. Evolutionary expansion of SPOP and associated TD/POZ gene family: Impact of evolutionary route on gene expression pattern
- Author
-
Yao-Hui Tsai, Trees-Juen Chuang, Kong-Bung Choo, Chiu-Jung Huang, Che-Ming Chang, and Wan-Yi Lin
- Subjects
Transcriptional Activation ,Molecular Sequence Data ,Gene Expression ,Sequence alignment ,Retrotransposon ,SPOP ,Biology ,Genome ,Cell Line ,Evolution, Molecular ,Mice ,Cell Line, Tumor ,Histocompatibility Antigens ,Gene expression ,Genetics ,Animals ,Humans ,Gene family ,Amino Acid Sequence ,Gene ,Segmental duplication ,Nuclear Proteins ,General Medicine ,Rats ,Repressor Proteins ,DNA Transposable Elements ,Sequence Alignment - Abstract
Evolutionary expansion of a gene family may occur at both the DNA and RNA levels. The rat testis-specific Rtdpoz-T2 and -T1 (rT2 and rT1) retrogenes are members of the TD/POZ gene family which also includes the well-characterized SPOP gene. In this study, rT2/rT1 transcriptional activation in cancer cells is demonstrated; the cancer rT2/rT1 transcripts are structurally similar to the embryonic transcripts reported previously in frequent exonization of transposed elements. On database interrogation, we have identified an uncharacterized rT2/rT1-like SPOP paralog, designated as SPOP-like (SPOPL), in the human and rodent genomes. Ka/Ks analysis indicates that the SPOPL genes are under functional constraints implicating biological functions. Phylogenetic analyses further suggest that segmental duplication and retrotransposition events had occurred giving rise to new gene members or retrogenes in the human-rodent ancestors during the evolution of the TD/POZ gene family. Based on this and previous works, a model is proposed to map the routes of evolutionary expansion of the TD/POZ gene family. More importantly, different gene expression patterns of members of the family are depicted: intron-harboring members are ubiquitously expressed whereas retrogenes are expressed in tissue-specific and developmentally regulated manner, and are fortuitously re-activated in cancer cells involving exonization of transposed elements.
- Published
- 2010
37. Scanning for the Signatures of Positive Selection for Human-Specific Insertions and Deletions
- Author
-
Chun-Hsi Chen, Feng-Chi Chen, Ben-Yang Liao, and Trees-Juen Chuang
- Subjects
False discovery rate ,Genetics ,recent selective sweep ,Positive selection ,food and beverages ,Population genetics ,human-specific indels ,Biology ,positive selection ,Human genome ,Letters ,Gene conversion ,Selective sweep ,Indel ,Gene ,Ecology, Evolution, Behavior and Systematics - Abstract
Human-specific small insertions and deletions (HS indels, with lengths
- Published
- 2009
38. Plant Gene and Alternatively Spliced Variant Annotator. A Plant Genome Annotation Pipeline for Rice Gene and Alternatively Spliced Variant Identification with Cross-Species Expressed Sequence Tag Conservation from Seven Plant Species
- Author
-
Trees-Juen Chuang, Shu-Miaw Chaw, Feng-Chi Chen, Yao-Ting Huang, and Sheng-Shun Wang
- Subjects
Bioinformatics ,Physiology ,Molecular Sequence Data ,Plant Science ,Biology ,Genome ,Evolution, Molecular ,Databases, Genetic ,Genetics ,Protein Isoforms ,Gene ,Conserved Sequence ,Expressed Sequence Tags ,Internet ,Expressed sequence tag ,Oryza sativa ,Base Sequence ,Alternative splicing ,food and beverages ,Oryza ,Genome project ,Gene Annotation ,Plants ,Hordeum vulgare ,Genome, Plant ,Software - Abstract
The completion of the rice (Oryza sativa) genome draft has brought unprecedented opportunities for genomic studies of the world's most important food crop. Previous rice gene annotations have relied mainly on ab initio methods, which usually yield a high rate of false-positive predictions and give only limited information regarding alternative splicing in rice genes. Comparative approaches based on expressed sequence tags (ESTs) can compensate for the drawbacks of ab initio methods because they can simultaneously identify experimental data-supported genes and alternatively spliced transcripts. Furthermore, cross-species EST information can be used to not only offset the insufficiency of same-species ESTs but also derive evolutionary implications. In this study, we used ESTs from seven plant species, rice, wheat (Triticum aestivum), maize (Zea mays), barley (Hordeum vulgare), sorghum (Sorghum bicolor), soybean (Glycine max), and Arabidopsis (Arabidopsis thaliana), to annotate the rice genome. We developed a plant genome annotation pipeline, Plant Gene and Alternatively Spliced Variant Annotator (PGAA). Using this approach, we identified 852 genes (931 isoforms) not annotated in other widely used databases (i.e. the Institute for Genomic Research, National Center for Biotechnology Information, and Rice Annotation Project) and found 87% of them supported by both rice and nonrice EST evidence. PGAA also identified more than 44,000 alternatively spliced events, of which approximately 20% are not observed in the other three annotations. These novel annotations represent rich opportunities for rice genome research, because the functions of most of our annotated genes are currently unknown. Also, in the PGAA annotation, the isoforms with non-rice-EST-supported exons are significantly enriched in transporter activity but significantly underrepresented in transcription regulator activity. 
We have also identified potential lineage-specific and conserved isoforms, which are important markers in evolutionary studies. The data and the Web-based interface, RiceViewer, are available for public access at http://RiceViewer.genomics.sinica.edu.tw/.
- Published
- 2007
39. Biogenesis, identification, and function of exonic circular RNAs
- Author
-
Iju Chen, Chia-Ying Chen, and Trees-Juen Chuang
- Subjects
Advanced Review ,Computational Analyses of RNA ,Splicing Regulation/Alternative Splicing ,Animals ,Humans ,Nucleic Acid Conformation ,RNA ,Advanced Reviews ,Exons ,RNA Processing, Post-Transcriptional ,RNA Structure and Dynamics - Abstract
Circular RNAs (circRNAs) arise during post‐transcriptional processes, in which a single‐stranded RNA molecule forms a circle through covalent binding. Previously, circRNA products were often regarded to be splicing intermediates, by‐products, or products of aberrant splicing. But recently, rapid advances in high‐throughput RNA sequencing (RNA‐seq) for global investigation of nonco‐linear (NCL) RNAs, which comprised sequence segments that are topologically inconsistent with the reference genome, leads to renewed interest in this type of NCL RNA (i.e., circRNA), especially exonic circRNAs (ecircRNAs). Although the biogenesis and function of ecircRNAs are mostly unknown, some ecircRNAs are abundant, highly expressed, or evolutionarily conserved. Some ecircRNAs have been shown to affect microRNA regulation, and probably play roles in regulating parental gene transcription, cell proliferation, and RNA‐binding proteins, indicating their functional potential for development as diagnostic tools. To date, thousands of ecircRNAs have been identified in multiple tissues/cell types from diverse species, through analyses of RNA‐seq data. However, the detection of ecircRNA candidates involves several major challenges, including discrimination between ecircRNAs and other types of NCL RNAs (e.g., trans‐spliced RNAs and genetic rearrangements); removal of sequencing errors, alignment errors, and in vitro artifacts; and the reconciliation of heterogeneous results arising from the use of different bioinformatics methods or sequencing data generated under different treatments. Such challenges may severely hamper the understanding of ecircRNAs. Herein, we review the biogenesis, identification, properties, and function of ecircRNAs, and discuss some unanswered questions regarding ecircRNAs. We also evaluate the accuracy (in terms of sensitivity and precision) of some well‐known circRNA‐detecting methods. WIREs RNA 2015, 6:563–579. 
doi: 10.1002/wrna.1294
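The abstract evaluates circRNA-detecting methods in terms of sensitivity and precision. As a rough illustration of those two measures, here is a minimal sketch that assumes each method's calls and a reference set are available as collections of junction identifiers. The function name and the identifiers are hypothetical and are not taken from the review.

```python
def sensitivity_precision(predicted, truth):
    """Return (sensitivity, precision) for a set of predicted circRNA calls.

    predicted: identifiers reported by a detection method (hypothetical input).
    truth:     identifiers in a trusted reference set (hypothetical input).
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    # Sensitivity: share of true circRNAs that the method recovered.
    sensitivity = true_positives / len(truth) if truth else 0.0
    # Precision: share of the method's calls that are true circRNAs.
    precision = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, precision


calls = ["circ_a", "circ_b", "circ_c", "circ_d"]
reference = ["circ_a", "circ_b", "circ_e"]
sens, prec = sensitivity_precision(calls, reference)
```

A method that reports many candidates can score high on sensitivity and low on precision, which is why the two values are reported together.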
- Published
- 2015
40. ESTviewer: a web interface for visualizing mouse, rat, cattle, pig and chicken conserved ESTs in human genes and human alternatively spliced variants
- Author
-
Feng-Chi Chen and Trees-Juen Chuang
- Subjects
Statistics and Probability ,Swine ,Computational biology ,Biology ,Biochemistry ,Evolution, Molecular ,Mice ,User-Computer Interface ,Exon ,Annotation ,Species Specificity ,Sequence Homology, Nucleic Acid ,Computer Graphics ,Animals ,Humans ,Molecular Biology ,Gene ,Conserved Sequence ,Expressed Sequence Tags ,Genetics ,Internet ,Expressed sequence tag ,Structural gene ,Alternative splicing ,Chromosome Mapping ,food and beverages ,Sequence Analysis, DNA ,Rats ,Computer Science Applications ,Computational Mathematics ,Computational Theory and Mathematics ,Genetic marker ,Cattle ,Human genome ,Chickens ,Software - Abstract
Summary: ESTviewer is a web application for interactively visualizing human gene structures, with emphasis on mammalian and avian expressed sequence tags (ESTs) that are conserved in the human genome and alternatively spliced (AS) variants. AS variants from the UCSC, Vega and PSEP annotations are presented in this application for comparison. EST data from six species, human, mouse, rat, cattle, pig and chicken, are mapped to the human genome to show cross-species EST conservation in annotated exonic and intronic regions. Cross-species EST conservation is evolutionarily and functionally important because it represents the effects of selection pressure on genic regions and transcriptome over evolutionary time. Emphatically, ESTviewer provides a convenient tool to compare highly conserved non-human ESTs and human AS variants. The application takes human gene accession Ids or coordinates of genomic sequences as inputs and presents annotated gene structures and their AS variants. In addition, the lengths and percentages of human genic regions covered by ESTs are displayed to show the level of EST coverage of different species. The percentages of the UCSC, Vega and PSEP annotated exons covered by ESTs of the six studied species are also displayed in the interface. Availability: The ESTviewer web interface is publicly accessible at http://www.gate.sinica.edu.tw/~trees/ESTviewer/ESTviewer.htm Contact: trees@gate.sinica.edu.tw Supplementary information: Detailed documentation and the data sets, including the whole human genome annotation of PSEP and 6-species ESTs conserved in the human genome, can be found on the ESTviewer home page.
- Published
- 2005
41. A Complexity Reduction Algorithm for Analysis and Annotation of Large Genomic Sequences
- Author
-
Wen-chang Lin, Danny Shieh, Simon Lin, Zi-Hao Wang, Chi-Wei Wang, H. Lee, Lan-Yang Ch'ang, Keh-Lin Hsiao, and Trees-Juen Chuang
- Subjects
DNA, Complementary ,Chromosomes, Human, Pair 21 ,Sequence analysis ,Chromosomes, Human, Pair 22 ,Gene prediction ,Sequence alignment ,Biology ,Sensitivity and Specificity ,Genome ,Annotation ,Sequence Homology, Nucleic Acid ,Methods ,Genetics ,Humans ,Pattern matching ,Genetics (clinical) ,Expressed Sequence Tags ,Genome, Human ,Reproducibility of Results ,DNA ,Exons ,Sequence Analysis, DNA ,Genome project ,Data structure ,Genes ,Sequence Alignment ,Algorithm ,Algorithms ,Pseudogenes - Abstract
The entire human genome has been sequenced and annotated separately by Lander et al. (2001) and Venter et al. (2001). Altogether, 30,000 to 40,000 protein-coding genes were annotated from the genomic sequence. This number, roughly twice as many as in the worm or fly, deviates greatly from the earlier high estimates (Ewing and Green 2000; Liang et al. 2000; Roest Crollius et al. 2000). The exact gene number in the human genome remains to be determined by accurate annotation of the sequence data. Genome annotation is based primarily on the ab initio and homology methods. The ab initio approach predicts genes directly from the genomic sequence using the computational properties of exons, introns, and other signature features without referencing the experimental data. Numerous ab initio prediction programs have been used extensively in genome annotation, including FGENESH (Solovyev et al. 1995; Salamov and Solovyev 2000), GeneID (Parra et al. 2000), GeneMark.hmm (Lukashin and Borodovsky 1998), GeneView (Milanesi et al. 1993), GENSCAN (Burge and Karlin 1997, 1998), Genie (Kulp et al. 1996; Reese et al. 2000), Grail (Xu et al. 1994), GrailEXP_Perceval (Hyatt et al. 2000), HMMgene (Krogh 1998, 2000), and MZEF (Zhang 1997). The homology approach identifies genes with the aid of experimental data. This approach exploits sequence alignment between the genomic data and known cDNA or protein databases. Successful implementation of this method includes AAT (Huang et al. 1997), FGENESH+ and FGENESH++ (Salamov and Solovyev 2000), GAIA (Bailey et al. 1998), GeneBuilder (Milanesi et al. 1999), GenomeScan (Yeh et al. 2001), GrailEXP_Gawain (Hyatt et al. 2000), GeneWise (Birney and Durbin 2000), ICE (Pachter et al. 1999), and Procrustes (Gelfand et al. 1996; Sze and Pevzner 1997; Mironov et al. 1998). Among these programs FGENESH+ (and FGENESH++), GenomeScan, GeneWise, and Procrustes are combined tools of sequence homology and ab initio annotation. 
Generally speaking, the ab initio approach tends to have a higher rate of false-positive predictions (overprediction) in annotating long genomic sequences with multiple genes (Dunham et al. 1999). The homology-based approaches demand high-performance computing and large storage space. Furthermore, these methods require extensive manual interventions to curate true gene prediction from large sets of matched data. The combination tools for sequence alignment and ab initio annotation, although highly accurate, are not robust in routine applications. In this paper, we propose a new method, the Complexity Reduction Algorithm for Sequence Analysis (CRASA), for global alignment and annotation of the genomic sequence. The method finds the exact match between the cDNA data and genomic sequence, thus mapping the expressed genes directly to it. By using a set of filters, the enormous data complexity is reduced substantially. Thus, it provides an annotated framework of expressed genes in the genome. The CRASA system restructures the cDNA data progressively into a pattern-based pyramidal data structure in hierarchical orders. The algorithm offers an automatic search of the entire database efficiently and is amenable to the implementation of parallel processing (see Methods). In this paper, CRASA was tested with two benchmark data sets, the SemiArtificial Genomic (SAG) sequences provided by Guigo et al. (2000) and the Real Genomic (RG) sequences generated ad hoc from GenBank of NCBI (National Center for Biotechnology Information). In general, CRASA was capable of delivering annotation accuracy better than the other 15 programs tested in this study (see Results and Discussion). The annotated human Chromosomes 21 (Hattori et al. 2000) and 22 (Dunham et al. 1999), although incomplete, are considered as standard benchmarks for genome annotation. 
In the benchmark test of human Chromosomes 21 and 22, CRASA's filters were able to remove the massive noise from matched hits, thus reducing the complexity of genome analysis. More significantly, our method identified 83 additional EST matches that were not annotated previously. These 83 matches were extracted and categorized into five classes. Our results indicate that CRASA, with its capabilities of complexity reduction, progressive data transmission, and direct pattern match, is a robust and effective new method for genome annotation. The simplicity of program implementation allows for unlimited query size and parallel processing on multiple processors. It is well suited for annotating large genomic sequences.
- Published
- 2003
42. Comparative genomics of grass EST libraries reveals previously uncharacterized splicing events in crop plants
- Author
-
Min-Yu Yang, Li-Yuan Hung, Chuang-Chieh Lin, Ping-Hung Hsieh, and Trees-Juen Chuang
- Subjects
Crops, Agricultural ,Bioinformatics ,Genomics ,Plant Science ,Biology ,Poaceae ,Real-Time Polymerase Chain Reaction ,Genome ,Zea mays ,Crop plants ,Protein Isoforms ,Gene ,Sorghum ,Plant Proteins ,Genetics ,Comparative genomics ,Expressed Sequence Tags ,Expressed sequence tag ,Oryza sativa ,Methodology Article ,Alternative splicing ,food and beverages ,Oryza ,Exons ,Evolutionary rate ,Alternative Splicing ,RNA splicing ,Plant transcriptome evolution ,Genome, Plant - Abstract
Background Crop plants such as rice, maize and sorghum play economically-important roles as main sources of food, fuel, and animal feed. However, current genome annotations of crop plants still suffer false-positive predictions; a more comprehensive registry of alternative splicing (AS) events is also in demand. Comparative genomics of crop plants is largely unexplored. Results We performed a large-scale comparative analysis (ExonFinder) of the expressed sequence tag (EST) library from nine grass plants against three crop genomes (rice, maize, and sorghum) and identified 2,879 previously-unannotated exons (i.e., novel exons) in the three crops. We validated 81% of the tested exons by RT-PCR-sequencing, supporting the effectiveness of our in silico strategy. Evolutionary analysis reveals that the novel exons, comparing with their flanking annotated ones, are generally under weaker selection pressure at the protein level, but under stronger pressure at the RNA level, suggesting that most of the novel exons also represent novel alternatively spliced variants (ASVs). However, we also observed the consistency of evolutionary rates between certain novel exons and their flanking exons, which provided further evidence of their co-occurrence in the transcripts, suggesting that previously-annotated isoforms might be subject to erroneous predictions. Our validation showed that 54% of the tested genes expressed the newly-identified isoforms that contained the novel exons, rather than the previously-annotated isoforms that excluded them. The consistent results were steadily observed across cultivated (Oryza sativa and O. glaberrima) and wild (O. rufipogon and O. nivara) rice species, asserting the necessity of our curation of the crop genome annotations. Our comparative analyses also inferred the common ancestral transcriptome of grass plants and gain- and loss-of-ASV events. 
Conclusions We have reannotated the rice, maize, and sorghum genomes, and showed that evolutionary rates might serve as an indicator for determining whether the identified exons were alternatively spliced. This study not only presents an effective in silico strategy for the improvement of plant annotations, but also provides further insights into the role of AS events in the evolution and domestication of crop plants. ExonFinder and the novel exons/ASVs identified are publicly accessible at http://exonfinder.sourceforge.net/. Electronic supplementary material The online version of this article (doi:10.1186/s12870-015-0431-7) contains supplementary material, which is available to authorized users.
- Published
- 2014
43. Identification and analysis of ancestral hominoid transcriptome inferred from cross-species transcript and processed pseudogene comparisons
- Author
-
Yao-Ting Huang, Chiuan-Jung Chen, Feng-Chi Chen, Hsin-Liang Chen, and Trees-Juen Chuang
- Subjects
Pseudogenes -- Research ,Genetic translation -- Research ,Genetic transcription -- Research ,Health - Abstract
The development of a new method to comparatively extract novel transcripts from processed pseudogenes (PPGs) and identify 643 novel human exons/alternatively spliced variants is reported. The novel exons identified based on chimpanzee transcripts are significantly enriched in genes related to translation regulatory activity and viral life cycle.
- Published
- 2008
44. CNVVdb: a database of copy number variations across vertebrate genomes
- Author
-
Feng-Chi Chen, Trees-Juen Chuang, and Yen-Zho Chen
- Subjects
Statistics and Probability ,Pseudogene ,Gene Dosage ,Genomics ,Single-nucleotide polymorphism ,Computational biology ,Biology ,Polymorphism, Single Nucleotide ,Biochemistry ,Gene dosage ,Genome ,Databases, Genetic ,Animals ,Humans ,Copy-number variation ,Molecular Biology ,Gene ,Sequence (medicine) ,Genetics ,Internet ,Genetic Variation ,Genome Analysis ,Computer Science Applications ,Applications Note ,Computational Mathematics ,Computational Theory and Mathematics ,Software - Abstract
Summary: CNVVdb is a web interface for identification of putative copy number variations (CNVs) among 16 vertebrate species using the-same-species self-alignments and cross-species pairwise alignments. By querying genomic coordinates in the target species, all the potential paralogous/orthologous regions that overlap ≥80–100% (adjustable) of the query sequences with user-specified sequence identity (≥60%∼≥90%) are returned. Additional information is also given for the genes that are included in the returned regions, including gene description, alternatively spliced transcripts, gene ontology descriptions and other biologically important information. CNVVdb also provides information of pseudogenes and single nucleotide polymorphisms (SNPs) for the CNV-related genomic regions. Moreover, multiple sequence alignments of shared CNVs across species are also provided. With the combination of CNV, SNP, pseudogene and functional information, CNVVdb can be very useful for comparative and functional studies in vertebrates. Availability: CNVVdb is freely accessible at http://CNVVdb.genomics.sinica.edu.tw. Contact: trees@gate.sinica.edu.tw
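The query behaviour described in this abstract (return regions that cover at least an adjustable fraction of the query at a user-specified identity) can be pictured with a short sketch. This is only an illustration under stated assumptions: the record layout (`start`, `end`, `identity`), the function names and the default thresholds are hypothetical, and nothing here reflects how CNVVdb is actually implemented.

```python
def overlap_fraction(query, region):
    """Fraction of the query interval covered by a region.

    Both arguments are (start, end) tuples using 1-based inclusive coordinates.
    """
    q_start, q_end = query
    r_start, r_end = region
    covered = min(q_end, r_end) - max(q_start, r_start) + 1
    return max(covered, 0) / (q_end - q_start + 1)


def filter_regions(query, regions, min_overlap=0.8, min_identity=0.6):
    """Keep candidate regions that meet both thresholds (assumed defaults)."""
    return [
        r for r in regions
        if overlap_fraction(query, (r["start"], r["end"])) >= min_overlap
        and r["identity"] >= min_identity
    ]


candidates = [
    {"start": 121, "end": 300, "identity": 0.92},  # covers 80% of the query
    {"start": 150, "end": 400, "identity": 0.95},  # covers 51% of the query
    {"start": 101, "end": 200, "identity": 0.55},  # full overlap, low identity
]
hits = filter_regions((101, 200), candidates)
```

Measuring overlap against the query length rather than the region length matches the abstract's wording, which states the threshold in terms of the query sequences.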
- Published
- 2009
45. Human-specific insertions and deletions inferred from mammalian genome sequences
- Author
-
Feng-Chi Chen, Chueng-Jong Chen, Wen-Hsiung Li, and Trees-Juen Chuang
- Subjects
Chimpanzees -- Genetic aspects ,Chimpanzees -- Research ,Insertion elements, DNA -- Research ,Mammals -- Genetic aspects ,Mammals -- Research ,Nucleotide sequencing -- Analysis ,Health - Abstract
The false-specific rates of human-specific insertions and deletions (indels) inferred from human-chimpanzee pairwise sequence alignments are investigated with the use of several multiple sequence alignments of mammalian genomes. The results show that the genes that are affected by such indels are highly rich in various regulatory activities, but are underrepresented in other functional processes, leading to changes at the RNA and protein level.
- Published
- 2007
46. Mathematical properties of some measures of evolutionary distance
- Author
-
Trees-Juen Chuang, Wen-Hsiung Li, and Yun-Huei Tzeng
- Subjects
Statistics and Probability ,General Immunology and Microbiology ,Applied Mathematics ,Modeling and Simulation ,Mathematical properties ,Amino acid substitution ,General Medicine ,Biological evolution ,General Agricultural and Biological Sciences ,Biological system ,General Biochemistry, Genetics and Molecular Biology ,Mathematics - Published
- 2007
47. A NEW ALGORITHM FOR LOSSLESS STILL IMAGE COMPRESSION
- Author
-
Trees-Juen Chuang and Ja-Chen Lin
- Subjects
Lossless compression ,JBIG2 ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Data compression ratio ,Data_CODINGANDINFORMATIONTHEORY ,computer.file_format ,Lossy compression ,Grayscale ,JPEG ,Artificial Intelligence ,Signal Processing ,Computer Vision and Pattern Recognition ,JBIG ,computer ,Lossless JPEG ,Algorithm ,Software ,Image compression ,Data compression - Abstract
This paper presents a spatial domain method for lossless still image compression using a new scheme: base switching (BS). The given image is partitioned into non-overlapping fixed-size subimages. Different subimages then get different compression ratios according to the base values of the subimages. In order to increase the compression ratio, a hierarchical technique is also used. It is found that the compression ratio of the proposed algorithm can compete with that of the VBSS and the international standard algorithms known as JBIG and Lossless JPEG. In addition, when the BS method is compared with the S+P method, which is an excellent frequency domain method that used EZW, although S+P method gains about 9% increase in the compression ratio, its encoding time (excluding I/O) is about three times longer than ours. The math theory needed to build up the proposed compression scheme is also provided. © 1998 Published by Elsevier Science Ltd on behalf of the Pattern Recognition Society. All rights reserved.
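The abstract does not spell out how a subimage's base value determines its coding cost. The sketch below shows one plausible reading, stated here as an assumption: the base of a block is taken as max - min + 1, and the pixel offsets from the minimum are packed as the digits of a single integer in that base. The function names are hypothetical, and the hierarchical technique mentioned in the abstract is not modelled.

```python
def encode_block(pixels):
    """Pack a block as (minimum, base, value); assumed scheme, see lead-in."""
    minimum = min(pixels)
    base = max(pixels) - minimum + 1
    value = 0
    for p in pixels:
        # Each offset (p - minimum) is one digit of a base-`base` number.
        value = value * base + (p - minimum)
    return minimum, base, value


def decode_block(minimum, base, value, n):
    """Recover the n pixels of a block from its packed representation."""
    pixels = []
    for _ in range(n):
        value, digit = divmod(value, base)
        pixels.append(digit + minimum)
    return pixels[::-1]  # digits come out last-first


def payload_bits(base, n):
    """Bits needed to store any packed value for n pixels in the given base."""
    return (base ** n - 1).bit_length()


block = [100, 101, 103, 100, 102, 101, 100, 103, 102]  # a 3x3 subimage
minimum, base, value = encode_block(block)
restored = decode_block(minimum, base, value, len(block))
```

Under this reading, a smooth block with a small base needs far fewer payload bits than the 8 bits per pixel of the raw data, which would explain why different subimages reach different compression ratios. The minimum and the base still have to be stored alongside the payload.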
- Published
- 1998
48. An Evolutionary Landscape of A-to-I RNA Editome across Metazoan Species.
- Author
-
Li-Yuan Hung, Yen-Ju Chen, Te-Lun Mai, Chia-Ying Chen, Min-Yu Yang, Tai-Wei Chiang, Yi-Da Wang, and Trees-Juen Chuang
- Subjects
ADENOSINES ,INOSINE ,RNA sequencing ,METAZOA evolution ,GENOME editing - Abstract
Adenosine-to-inosine (A-to-I) editing is widespread across the kingdom Metazoa. However, for the lack of comprehensive analysis in nonmodel animals, the evolutionary history of A-to-I editing remains largely unexplored. Here, we detect high-confidence editing sites using clustering and conservation strategies based on RNA sequencing data alone, without using single-nucleotide polymorphism information or genome sequencing data from the same sample. We thereby unveil the first evolutionary landscape of A-to-I editing maps across 20 metazoan species (from worm to human), providing unprecedented evidence on how the editing mechanism gradually expands its territory and increases its influence along the history of evolution. Our result revealed that highly clustered and conserved editing sites tended to have a higher editing level and a higher magnitude of the ADAR motif. The ratio of the frequencies of nonsynonymous editing to that of synonymous editing remarkably increased with increasing the conservation level of A-to-I editing. These results thus suggest potentially functional benefit of highly clustered and conserved editing sites. In addition, spatiotemporal dynamics analyses reveal a conserved enrichment of editing and ADAR expression in the central nervous system throughout more than 300 Myr of divergent evolution in complex animals and the comparability of editing patterns between invertebrates and between vertebrates during development. This study provides evolutionary and dynamic aspects of A-to-I editome across metazoan species, expanding this important but understudied class of nongenomically encoded events for comprehensive characterization. [ABSTRACT FROM AUTHOR]
- Published
- 2018
49. A complexity reduction algorithm for analysis and annotation of large genomic sequences
- Author
-
Trees-Juen Chuang, Wen-Chang Lin, Hurng-Chun Lee, Chi-Wei Wang, Keh-Lin Hsiao, Zi-Hao Wang, Shieh, Danny, Lin, Simon C., and Lan-Yang Ch'ang
- Subjects
Genomes -- Research ,Genetic research -- Analysis ,Genetic algorithms -- Usage ,DNA -- Genetic aspects ,Health - Abstract
Research has been conducted on genomic sequence. The authors describe the method developed for genomic sequence global alignment and annotation, and report that this method can find the match between cDNA data and genomic sequence.
- Published
- 2003
50. LDGIdb: a database of gene interactions inferred from long-range strong linkage disequilibrium between pairs of SNPs
- Author
-
Feng-Chi Chen, Yao-Ting Huang, Ming-Chih Wang, Yen-Zho Chen, and Trees-Juen Chuang
- Subjects
Linkage disequilibrium ,lcsh:Medicine ,Single-nucleotide polymorphism ,Genome-wide association study ,Biology ,Data Note ,computer.software_genre ,Polymorphism, Single Nucleotide ,Linkage Disequilibrium ,General Biochemistry, Genetics and Molecular Biology ,User-Computer Interface ,03 medical and health sciences ,0302 clinical medicine ,Gene interaction ,Databases, Genetic ,Cluster Analysis ,Humans ,SNP ,Genetic Predisposition to Disease ,International HapMap Project ,Promoter Regions, Genetic ,lcsh:Science (General) ,lcsh:QH301-705.5 ,Genetic Association Studies ,030304 developmental biology ,Genetic association ,Medicine(all) ,Genetics ,0303 health sciences ,Database ,Biochemistry, Genetics and Molecular Biology(all) ,Systems Biology ,lcsh:R ,Computational Biology ,Exons ,General Medicine ,Genetic hitchhiking ,Haplotypes ,lcsh:Biology (General) ,computer ,030217 neurology & neurosurgery ,lcsh:Q1-390 - Abstract
Background Complex human diseases may be associated with many gene interactions. Gene interactions take several different forms and it is difficult to identify all of the interactions that are potentially associated with human diseases. One approach that may fill this knowledge gap is to infer previously unknown gene interactions via identification of non-physical linkages between different mutations (or single nucleotide polymorphisms, SNPs) to avoid hitchhiking effect or lack of recombination. Strong non-physical SNP linkages are considered to be an indication of biological (gene) interactions. These interactions can be physical protein interactions, regulatory interactions, functional compensation/antagonization or many other forms of interactions. Previous studies have shown that mutations in different genes can be linked to the same disorders. Therefore, non-physical SNP linkages, coupled with knowledge of SNP-disease associations may shed more light on the role of gene interactions in human disorders. A user-friendly web resource that integrates information about non-physical SNP linkages, gene annotations, SNP information, and SNP-disease associations may thus be a good reference for biomedical research. Findings Here we extracted the SNPs located within the promoter or exonic regions of protein-coding genes from the HapMap database to construct a database named the Linkage-Disequilibrium-based Gene Interaction database (LDGIdb). The database stores 646,203 potential human gene interactions, which are potential interactions inferred from SNP pairs that are subject to long-range strong linkage disequilibrium (LD), or non-physical linkages. To minimize the possibility of hitchhiking, SNP pairs inferred to be non-physically linked were required to be located in different chromosomes or in different LD blocks of the same chromosomes. 
According to the genomic locations of the involved SNPs (i.e., promoter, untranslated region (UTR) and coding region (CDS)), the SNP linkages inferred were categorized into promoter-promoter, promoter-UTR, promoter-CDS, CDS-CDS, CDS-UTR and UTR-UTR linkages. For the CDS-related linkages, the coding SNPs were further classified into nonsynonymous and synonymous variations, which represent potential gene interactions at the protein and RNA level, respectively. The LDGIdb also incorporates human disease-association databases such as Genome-Wide Association Studies (GWAS) and Online Mendelian Inheritance in Man (OMIM), so that the user can search for potential disease-associated SNP linkages. The inferred SNP linkages are also classified in the context of population stratification to provide a resource for investigating potential population-specific gene interactions. Conclusion The LDGIdb is a user-friendly resource that integrates non-physical SNP linkages and SNP-disease associations for studies of gene interactions in human diseases. With the help of the LDGIdb, it is plausible to infer population-specific SNP linkages for more focused studies, an avenue that is potentially important for pharmacogenetics. Moreover, by referring to disease-association information such as the GWAS data, the LDGIdb may help identify previously uncharacterized disease-associated gene interactions and potentially lead to new discoveries in studies of human diseases. Keywords Gene interaction, SNP, Linkage disequilibrium, Systems biology, Bioinformatics
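For readers unfamiliar with how linkage disequilibrium between a pair of SNPs is quantified, the following sketch computes the standard coefficients D and r² from phased haplotypes. It is a textbook illustration only: the abstract does not state which LD statistic or threshold LDGIdb uses, and the input encoding (0/1 alleles per haplotype) and function name are assumptions made for this example.

```python
def ld_d_and_r2(haplotypes):
    """Return (D, r2) for two biallelic SNPs.

    haplotypes: list of (a, b) pairs, where a and b are 0 or 1 and give the
    allele carried at SNP A and SNP B on one phased haplotype.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return d, 0.0                         # a monomorphic SNP carries no LD signal
    return d, d * d / denom


linked = [(1, 1), (1, 1), (0, 0), (0, 0)]     # alleles always travel together
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]   # alleles combine at random
d_linked, r2_linked = ld_d_and_r2(linked)
d_unlinked, r2_unlinked = ld_d_and_r2(unlinked)
```

An r² close to 1 between SNPs on different chromosomes cannot be explained by physical proximity, which is the kind of non-physical linkage the abstract treats as a hint of gene interaction.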
- Published
- 2012