397 results on '"Morris, Quaid"'
Search Results
352. Contributions of steroidogenic factor 1 to the transcription landscape of Y1 mouse adrenocortical tumor cells.
- Author
- Schimmer BP, Tsao J, Cordova M, Mostafavi S, Morris Q, and Scheys JO
- Subjects
- Animals, Cell Line, Tumor, Cholesterol Side-Chain Cleavage Enzyme genetics, Chromatin Immunoprecipitation, Clone Cells, Gene Expression Profiling, Gene Expression Regulation, Neoplastic, Gene Knockdown Techniques, Mice, Phenotype, Phosphoproteins genetics, Phosphoproteins metabolism, Promoter Regions, Genetic genetics, Protein Binding, RNA, Messenger genetics, RNA, Messenger metabolism, RNA, Small Interfering metabolism, Steroid 11-beta-Hydroxylase genetics, Steroidogenic Factor 1 genetics, Steroids biosynthesis, Transformation, Genetic, Adrenal Cortex Neoplasms genetics, Steroidogenic Factor 1 metabolism, Transcription, Genetic
- Abstract
The contribution of steroidogenic factor 1 (SF-1) to the gene expression profile of Y1 mouse adrenocortical cells was evaluated using short hairpin RNAs to knock down SF-1. The reduced level of SF-1 RNA was associated with global changes that affected the accumulation of more than 2000 transcripts. Among the down-regulated transcripts were several with functions in steroidogenesis that were affected to different degrees, i.e., Mc2r>Scarb1>Star≥Hsd3b1>Cyp11b1. For Star and Cyp11b1, the different levels of expression correlated with the amount of residual SF-1 bound to the proximal promoter regions. The knockdown of SF-1 did not affect the accumulation of Cyp11a1 transcripts even though the amount of SF-1 bound to the proximal promoter of the gene was reduced to background levels. Our results indicate that transcripts with functions in steroidogenesis vary in their dependence on SF-1 for constitutive expression. On a more global scale, SF-1 knockdown affects the accumulation of a large number of transcripts, most of which are not recognizably involved in steroid hormone biosynthesis. (Crown Copyright © 2010. Published by Elsevier Ireland Ltd. All rights reserved.)
- Published
- 2011
353. Genome-wide analysis of alternative splicing in Caenorhabditis elegans.
- Author
- Ramani AK, Calarco JA, Pan Q, Mavandadi S, Wang Y, Nelson AC, Lee LJ, Morris Q, Blencowe BJ, Zhen M, and Fraser AG
- Subjects
- Animals, Databases, Genetic, Exons genetics, Female, Gene Expression Profiling, Genome-Wide Association Study, Male, Molecular Sequence Data, Oligonucleotide Array Sequence Analysis, Software, Alternative Splicing genetics, Caenorhabditis elegans genetics
- Abstract
Alternative splicing (AS) plays a crucial role in the diversification of gene function and regulation. Consequently, the systematic identification and characterization of temporally regulated splice variants is of critical importance to understanding animal development. We have used high-throughput RNA sequencing and microarray profiling to analyze AS in C. elegans across various stages of development. This analysis identified thousands of novel splicing events, including hundreds of developmentally regulated AS events. To make these data easily accessible and informative, we constructed the C. elegans Splice Browser, a web resource in which researchers can mine AS events of interest and retrieve information about their relative levels and regulation across development. The data presented in this study, along with the Splice Browser, provide the most comprehensive set of annotated splice variants in C. elegans to date, and are therefore expected to facilitate focused, high resolution in vivo functional assays of AS function.
- Published
- 2011
354. Predicting node characteristics from molecular networks.
- Author
- Mostafavi S, Goldenberg A, and Morris Q
- Subjects
- Databases, Genetic, Databases, Protein, Predictive Value of Tests, Proteins genetics, Proteins metabolism, Algorithms, Computational Biology methods, Gene Regulatory Networks, Protein Interaction Maps genetics, Software
- Abstract
A large number of genome-scale networks, including protein-protein and genetic interaction networks, are now available for several organisms. In parallel, many studies have focused on analyzing, characterizing, and modeling these networks. Beyond investigating the topological characteristics such as degree distribution, clustering coefficient, and average shortest-path distance, another area of particular interest is the prediction of nodes (genes) with a given characteristic (labels) - for example prediction of genes that cause a particular phenotype or have a given function. In this chapter, we describe methods and algorithms for predicting node labels from network-based datasets with an emphasis on label propagation algorithms (LPAs) and their relation to local neighborhood methods.
- Published
- 2011
355. Computational prediction of intronic microRNA targets using host gene expression reveals novel regulatory mechanisms.
- Author
- Radfar MH, Wong W, and Morris Q
- Subjects
- DNA, Intergenic genetics, Databases, Genetic, Gene Expression Profiling, Gene Regulatory Networks genetics, Humans, MicroRNAs metabolism, ROC Curve, Computational Biology methods, Gene Expression Regulation, Introns genetics, MicroRNAs genetics
- Abstract
Approximately half of known human miRNAs are located in the introns of protein coding genes. Some of these intronic miRNAs are only expressed when their host gene is and, as such, their steady state expression levels are highly correlated with those of the host gene's mRNA. Recently host gene expression levels have been used to predict the targets of intronic miRNAs by identifying other mRNAs that they have consistent negative correlation with. This is a potentially powerful approach because it allows a large number of expression profiling studies to be used but needs refinement because mRNAs can be targeted by multiple miRNAs and not all intronic miRNAs are co-expressed with their host genes. Here we introduce InMiR, a new computational method that uses a linear-Gaussian model to predict the targets of intronic miRNAs based on the expression profiles of their host genes across a large number of datasets. Our method recovers nearly twice as many true positives at the same fixed false positive rate as a comparable method that only considers correlations. Through an analysis of 140 Affymetrix datasets from Gene Expression Omnibus, we build a network of 19,926 interactions among 57 intronic miRNAs and 3,864 targets. InMiR can also predict which host genes have expression profiles that are good surrogates for those of their intronic miRNAs. Host genes that InMiR predicts are bad surrogates contain significantly more miRNA target sites in their 3' UTRs and are significantly more likely to have predicted Pol II and Pol III promoters in their introns. We provide a dataset of 1,935 predicted mRNA targets for 22 intronic miRNAs. These predictions are supported both by sequence features and expression.
By combining our results with previous reports, we distinguish three classes of intronic miRNAs: Those that are tightly regulated with their host gene; those that are likely to be expressed from the same promoter but whose host gene is highly regulated by miRNAs; and those likely to have independent promoters.
- Published
- 2011
356. RBPDB: a database of RNA-binding specificities.
- Author
- Cook KB, Kazan H, Zuberi K, Morris Q, and Hughes TR
- Subjects
- Animals, Binding Sites, Caenorhabditis elegans Proteins chemistry, Caenorhabditis elegans Proteins metabolism, Drosophila Proteins chemistry, Drosophila Proteins metabolism, Humans, Mice, Protein Structure, Tertiary, RNA, Messenger chemistry, RNA, Messenger metabolism, RNA-Binding Proteins chemistry, Sequence Analysis, RNA, Databases, Protein, RNA-Binding Proteins metabolism
- Abstract
The RNA-Binding Protein DataBase (RBPDB) is a collection of experimental observations of RNA-binding sites, both in vitro and in vivo, manually curated from primary literature. To build RBPDB, we performed a literature search for experimental binding data for all RNA-binding proteins (RBPs) with known RNA-binding domains in four metazoan species (human, mouse, fly and worm). In total, RBPDB contains binding data on 272 RBPs, including 71 that have motifs in position weight matrix format, and 36 sets of sequences of in vivo-bound transcripts from immunoprecipitation experiments. The database is accessible by a web interface which allows browsing by domain or by organism, searching and export of records, and bulk data downloads. Users can also use RBPDB to scan sequences for RBP-binding sites. RBPDB is freely available, without registration, at http://rbpdb.ccbr.utoronto.ca/.
- Published
- 2011
357. Cytoscape Web: an interactive web-based network browser.
- Author
- Lopes CT, Franz M, Kazi F, Donaldson SL, Morris Q, and Bader GD
- Subjects
- Internet, Software
- Abstract
Unlabelled: Cytoscape Web is a web-based network visualization tool, modeled after Cytoscape, which is open source, interactive, customizable and easily integrated into web sites. Multiple file exchange formats can be used to load data into Cytoscape Web, including GraphML, XGMML and SIF. Availability and Implementation: Cytoscape Web is implemented in Flex/ActionScript with a JavaScript API and is freely available at http://cytoscapeweb.cytoscape.org/.
- Published
- 2010
358. Fast integration of heterogeneous data sources for predicting gene function with limited annotation.
- Author
- Mostafavi S and Morris Q
- Subjects
- Algorithms, Animals, Data Collection, Databases, Genetic, Gene Expression Profiling methods, Humans, Linear Models, Mice, Genes physiology, Genomics methods
- Abstract
Motivation: Many algorithms that integrate multiple functional association networks for predicting gene function construct a composite network as a weighted sum of the individual networks and then use the composite network to predict gene function. The weight assigned to an individual network represents the usefulness of that network in predicting a given gene function. However, because many categories of gene function have a small number of annotations, the process of assigning these network weights is prone to overfitting. Results: Here, we address this problem by proposing a novel approach to combining multiple functional association networks. In particular, we present a method where network weights are simultaneously optimized on sets of related function categories. The method is simpler and faster than existing approaches. Further, we show that it produces composite networks with improved function prediction accuracy using five example species (yeast, mouse, fly, Escherichia coli and human). Availability: Networks and code are available from: http://morrislab.med.utoronto.ca/sara/SW
- Published
- 2010
359. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function.
- Author
- Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, Maitland A, Mostafavi S, Montojo J, Shao Q, Wright G, Bader GD, and Morris Q
- Subjects
- Algorithms, Animals, Gene Regulatory Networks, Genomics, Humans, Internet, Mice, Genes physiology, Software
- Abstract
GeneMANIA (http://www.genemania.org) is a flexible, user-friendly web interface for generating hypotheses about gene function, analyzing gene lists and prioritizing genes for functional assays. Given a query list, GeneMANIA extends the list with functionally similar genes that it identifies using available genomics and proteomics data. GeneMANIA also reports weights that indicate the predictive value of each selected data set for the query. Six organisms are currently supported (Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Homo sapiens and Saccharomyces cerevisiae) and hundreds of data sets have been collected from GEO, BioGRID, Pathway Commons and I2D, as well as organism-specific functional genomics data sets. Users can select arbitrary subsets of the data sets associated with an organism to perform their analyses and can upload their own data sets to analyze. The GeneMANIA algorithm performs as well as or better than other gene function prediction methods on yeast and mouse benchmarks. The high accuracy of the GeneMANIA prediction algorithm, an intuitive user interface and large database make GeneMANIA a useful tool for any biologist.
- Published
- 2010
360. RNAcontext: a new method for learning the sequence and structure binding preferences of RNA-binding proteins.
- Author
- Kazan H, Ray D, Chan ET, Hughes TR, and Morris Q
- Subjects
- Algorithms, Amino Acid Motifs, Area Under Curve, Base Sequence, Databases, Protein, Models, Genetic, Models, Statistical, Nucleic Acid Conformation, Protein Binding, RNA, Messenger chemistry, RNA, Messenger metabolism, Amino Acid Sequence, Binding Sites, Protein Conformation, RNA-Binding Proteins chemistry, RNA-Binding Proteins genetics, RNA-Binding Proteins metabolism
- Abstract
Metazoan genomes encode hundreds of RNA-binding proteins (RBPs). These proteins regulate post-transcriptional gene expression and have critical roles in numerous cellular processes including mRNA splicing, export, stability and translation. Despite their ubiquity and importance, the binding preferences for most RBPs are not well characterized. In vitro and in vivo studies, using affinity selection-based approaches, have successfully identified RNA sequence associated with specific RBPs; however, it is difficult to infer RBP sequence and structural preferences without specifically designed motif finding methods. In this study, we introduce a new motif-finding method, RNAcontext, designed to elucidate RBP-specific sequence and structural preferences with greater accuracy than existing approaches. We evaluated RNAcontext on recently published in vitro and in vivo RNA affinity selected data and demonstrate that RNAcontext identifies known binding preferences for several control proteins including HuR, PTB, and Vts1p and predicts new RNA structure preferences for SF2/ASF, RBM4, FUSIP1 and SLM2. The predicted preferences for SF2/ASF are consistent with its recently reported in vivo binding sites. RNAcontext is an accurate and efficient motif finding method ideally suited for using large-scale RNA-binding affinity datasets to determine the relative binding preferences of RBPs for a wide range of RNA sequences and structures.
- Published
- 2010
361. Predicting in vivo binding sites of RNA-binding proteins using mRNA secondary structure.
- Author
- Li X, Quon G, Lipshitz HD, and Morris Q
- Subjects
- 3' Untranslated Regions genetics, Animals, Binding Sites, Consensus Sequence, Conserved Sequence, Dinucleotide Repeats, Drosophila genetics, Drosophila metabolism, Drosophila Proteins chemistry, Drosophila Proteins genetics, Drosophila Proteins metabolism, Humans, Oligonucleotide Array Sequence Analysis, Predictive Value of Tests, RNA, Messenger chemistry, RNA-Binding Proteins genetics, RNA, Messenger genetics, RNA-Binding Proteins chemistry, RNA-Binding Proteins metabolism, Transcription, Genetic
- Abstract
While many RNA-binding proteins (RBPs) bind RNA in a sequence-specific manner, their sequence preferences alone do not distinguish known target RNAs from other potential targets that are coexpressed and contain the same sequence motifs. Recently, the mRNA targets of dozens of RNA-binding proteins have been identified, facilitating a systematic study of the features of target transcripts. Using these data, we demonstrate that calculating the predicted structural accessibility of a putative RBP binding site allows one to significantly improve the accuracy of predicting in vivo binding for the majority of sequence-specific RBPs. In our new in silico approach, accessibility is predicted based solely on the mRNA sequence without consideration of the locations of bound trans-factors; as such, our results suggest a greater than previously anticipated role for intrinsic mRNA secondary structure in determining RBP binding target preference. Target site accessibility aids in predicting target transcripts and the binding sites for RBPs with a range of RNA-binding domains and subcellular functions. Based on this work, we introduce a new motif-finding algorithm that identifies accessible sequence-specific RBP motifs from in vivo binding data.
- Published
- 2010
362. The genetic landscape of a cell.
- Author
- Costanzo M, Baryshnikova A, Bellay J, Kim Y, Spear ED, Sevier CS, Ding H, Koh JL, Toufighi K, Mostafavi S, Prinz J, St Onge RP, VanderSluis B, Makhnevych T, Vizeacoumar FJ, Alizadeh S, Bahr S, Brost RL, Chen Y, Cokol M, Deshpande R, Li Z, Lin ZY, Liang W, Marback M, Paw J, San Luis BJ, Shuteriqi E, Tong AH, van Dyk N, Wallace IM, Whitney JA, Weirauch MT, Zhong G, Zhu H, Houry WA, Brudno M, Ragibizadeh S, Papp B, Pál C, Roth FP, Giaever G, Nislow C, Troyanskaya OG, Bussey H, Bader GD, Gingras AC, Morris QD, Kim PM, Kaiser CA, Myers CL, Andrews BJ, and Boone C
- Subjects
- Computational Biology, Gene Duplication, Gene Expression Regulation, Fungal, Genes, Fungal, Genetic Fitness, Metabolic Networks and Pathways, Mutation, Protein Interaction Mapping, Saccharomyces cerevisiae physiology, Saccharomyces cerevisiae Proteins genetics, Gene Regulatory Networks, Genome, Fungal, Saccharomyces cerevisiae genetics, Saccharomyces cerevisiae metabolism, Saccharomyces cerevisiae Proteins metabolism
- Abstract
A genome-scale genetic interaction map was constructed by examining 5.4 million gene-gene pairs for synthetic genetic interactions, generating quantitative genetic interaction profiles for approximately 75% of all genes in the budding yeast, Saccharomyces cerevisiae. A network based on genetic interaction profiles reveals a functional map of the cell in which genes of similar biological processes cluster together in coherent subsets, and highly correlated profiles delineate specific pathways to define gene function. The global network identifies functional cross-connections between all bioprocesses, mapping a cellular wiring diagram of pleiotropy. Genetic interaction degree correlated with a number of different gene attributes, which may be informative about genetic network hubs in other organisms. We also demonstrate that extensive and unbiased mapping of the genetic landscape provides a key for interpretation of chemical-genetic interactions and drug target identification.
- Published
- 2010
363. Predicting the target genes of intronic microRNAs using large-scale gene expression data.
- Author
- Radfar M, Wong W, and Morris QD
- Subjects
- Algorithms, Binding Sites, Computer Simulation, Protein Binding, Gene Expression Profiling methods, MicroRNAs genetics, Models, Genetic, Proteins genetics, RNA, Messenger genetics
- Abstract
Current microRNA target prediction techniques provide long lists of putative miRNA-target interactions, many of which are false positives. The goal of this paper is to identify functional targets in these lists based on biological evidence obtained from the expression profiles of the host genes of intronic miRNAs and those of their targets. We propose a scoring strategy for each interaction based on the combinatorial effect of miRNAs. In particular, the change in expression level of a target gene is expressed in terms of a linear combination of the host gene data which are used as surrogates for expression data of the intronic miRNAs. The parameters of this linear model give an estimate of the contribution of each intronic miRNA in down-regulating the target gene. The experimental results show that our prediction technique is able to detect several functional interactions. In addition, the analysis of mRNA microarrays after intronic miRNA transfection confirms that significantly down-regulated genes are among targets detected by our technique.
- Published
- 2010
364. ISOLATE: a computational strategy for identifying the primary origin of cancers using high-throughput sequencing.
- Author
- Quon G and Morris Q
- Subjects
- Gene Expression Profiling methods, Computational Biology methods, Neoplasms genetics, Oligonucleotide Array Sequence Analysis methods, Software
- Abstract
Motivation: One of the most deadly cancer diagnoses is the carcinoma of unknown primary origin. Without the knowledge of the site of origin, treatment regimens are limited in their specificity and result in high mortality rates. Though supervised classification methods have been developed to predict the site of origin based on gene expression data, they require large numbers of previously classified tumors for training, in part because they do not account for sample heterogeneity, which limits their application to well-studied cancers. Results: We present ISOLATE, a new statistical method that simultaneously predicts the primary site of origin of cancers and addresses sample heterogeneity, while taking advantage of new high-throughput sequencing technology that promises to bring higher accuracy and reproducibility to gene expression profiling experiments. ISOLATE makes predictions de novo, without having seen any training expression profiles of cancers with identified origin. Compared with previous methods, ISOLATE is able to predict the primary site of origin, de-convolve and remove the effect of sample heterogeneity and identify differentially expressed genes with higher accuracy, across both synthetic and clinical datasets. Methods such as ISOLATE are invaluable tools for clinicians faced with carcinomas of unknown primary origin. Availability: ISOLATE is available for download at: http://morrislab.med.utoronto.ca/software. Contact: gerald.quon@utoronto.ca; quaid.morris@utoronto.ca. Supplementary Information: Supplementary data are available at Bioinformatics online.
- Published
- 2009
365. Rapid and systematic analysis of the RNA recognition specificities of RNA-binding proteins.
- Author
- Ray D, Kazan H, Chan ET, Peña Castillo L, Chaudhry S, Talukder S, Blencowe BJ, Morris Q, and Hughes TR
- Subjects
- Animals, Base Sequence, Binding Sites genetics, Databases, Nucleic Acid, Genome, Molecular Sequence Data, RNA chemistry, RNA genetics, RNA-Binding Proteins chemistry, RNA-Binding Proteins genetics, ROC Curve, Substrate Specificity, Oligonucleotide Array Sequence Analysis methods, RNA metabolism, RNA-Binding Proteins metabolism
- Abstract
Metazoan genomes encode hundreds of RNA-binding proteins (RBPs) but RNA-binding preferences for relatively few RBPs have been well defined. Current techniques for determining RNA targets, including in vitro selection and RNA co-immunoprecipitation, require significant time and labor investment. Here we introduce RNAcompete, a method for the systematic analysis of RNA binding specificities that uses a single binding reaction to determine the relative preferences of RBPs for short RNAs that contain a complete range of k-mers in structured and unstructured RNA contexts. We tested RNAcompete by analyzing nine diverse RBPs (HuR, Vts1, FUSIP1, PTB, U1A, SF2/ASF, SLM2, RBM4 and YB1). RNAcompete identified expected and previously unknown RNA binding preferences. Using in vitro and in vivo binding data, we demonstrate that preferences for individual 7-mers identified by RNAcompete are a more accurate representation of binding activity than are conventional motif models. We anticipate that RNAcompete will be a valuable tool for the study of RNA-protein interactions.
- Published
- 2009
366. Diversity and complexity in DNA recognition by transcription factors.
- Author
- Badis G, Berger MF, Philippakis AA, Talukder S, Gehrke AR, Jaeger SA, Chan ET, Metzler G, Vedenko A, Chen X, Kuznetsov H, Wang CF, Coburn D, Newburger DE, Morris Q, Hughes TR, and Bulyk ML
- Subjects
- Amino Acid Motifs, Amino Acid Sequence, Animals, Base Sequence, Binding Sites, DNA chemistry, Electrophoretic Mobility Shift Assay, Gene Expression Regulation, Gene Regulatory Networks, Humans, Mice, Protein Array Analysis, Protein Binding, Protein Structure, Tertiary, Recombinant Fusion Proteins chemistry, Recombinant Fusion Proteins metabolism, DNA metabolism, Transcription Factors chemistry, Transcription Factors metabolism
- Abstract
Sequence preferences of DNA binding proteins are a primary mechanism by which cells interpret the genome. Despite the central importance of these proteins in physiology, development, and evolution, comprehensive DNA binding specificities have been determined experimentally for only a few proteins. Here, we used microarrays containing all 10-base pair sequences to examine the binding specificities of 104 distinct mouse DNA binding proteins representing 22 structural classes. Our results reveal a complex landscape of binding, with virtually every protein analyzed possessing unique preferences. Roughly half of the proteins each recognized multiple distinctly different sequence motifs, challenging our molecular understanding of how proteins interact with their DNA binding sites. This complexity in DNA recognition may be important in gene regulation and in the evolution of transcriptional regulatory networks.
- Published
- 2009
367. Predicting the binding preference of transcription factors to individual DNA k-mers.
- Author
- Alleyne TM, Peña-Castillo L, Badis G, Talukder S, Berger MF, Gehrke AR, Philippakis AA, Bulyk ML, Morris QD, and Hughes TR
- Subjects
- Binding Sites, DNA metabolism, Transcription Factors chemistry, Computational Biology methods, DNA chemistry, Sequence Analysis, DNA methods, Transcription Factors metabolism
- Abstract
Motivation: Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. Results: We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.
- Published
- 2009
368. Dynamic modularity in protein interaction networks predicts breast cancer outcome.
- Author
- Taylor IW, Linding R, Warde-Farley D, Liu Y, Pesquita C, Faria D, Bull S, Pawson T, Morris Q, and Wrana JL
- Subjects
- Algorithms, Computational Biology, Computer Simulation, Data Interpretation, Statistical, Female, Humans, Kaplan-Meier Estimate, Neoplasm Proteins genetics, Neoplasm Proteins metabolism, Prognosis, ROC Curve, Reproducibility of Results, Statistics, Nonparametric, Ubiquitin-Protein Ligases genetics, Ubiquitin-Protein Ligases metabolism, Breast Neoplasms diagnosis, Breast Neoplasms metabolism, Gene Regulatory Networks physiology, Protein Interaction Mapping methods, Signal Transduction physiology
- Abstract
Changes in the biochemical wiring of oncogenic cells drive phenotypic transformations that directly affect disease outcome. Here we examine the dynamic structure of the human protein interaction network (interactome) to determine whether changes in the organization of the interactome can be used to predict patient outcome. An analysis of hub proteins identified intermodular hub proteins that are co-expressed with their interacting partners in a tissue-restricted manner and intramodular hub proteins that are co-expressed with their interacting partners in all or most tissues. Substantial differences in biochemical structure were observed between the two types of hubs. Signaling domains were found more often in intermodular hub proteins, which were also more frequently associated with oncogenesis. Analysis of two breast cancer patient cohorts revealed that altered modularity of the human interactome may be useful as an indicator of breast cancer prognosis.
- Published
- 2009
369. Application of an integrated physical and functional screening approach to identify inhibitors of the Wnt pathway.
- Author
- Miller BW, Lau G, Grouios C, Mollica E, Barrios-Rodiles M, Liu Y, Datti A, Morris Q, Wrana JL, and Attisano L
- Subjects
- Adaptor Proteins, Signal Transducing, Animals, Axin Protein, Calcium-Binding Proteins, Carrier Proteins metabolism, Cell Line, Humans, Mice, Models, Biological, Protein Binding, Protein Interaction Mapping, RNA Interference, Repressor Proteins metabolism, Ubiquitin-Conjugating Enzymes metabolism, High-Throughput Screening Assays methods, Signal Transduction, Wnt Proteins antagonists & inhibitors
- Abstract
Large-scale proteomic approaches have been used to study signaling pathways. However, identification of biologically relevant hits from a single screen remains challenging due to limitations inherent in each individual approach. To overcome these limitations, we implemented an integrated, multi-dimensional approach and used it to identify Wnt pathway modulators. The LUMIER protein-protein interaction mapping method was used in conjunction with two functional screens that examined the effect of overexpression and siRNA-mediated gene knockdown on Wnt signaling. Meta-analysis of the three data sets yielded a combined pathway score (CPS) for each tested component, a value reflecting the likelihood that an individual protein is a Wnt pathway regulator. We characterized the role of two proteins with high CPSs, Ube2m and Nkd1. We show that Ube2m interacts with and modulates beta-catenin stability, and that the antagonistic effect of Nkd1 on Wnt signaling requires interaction with Axin, itself a negative pathway regulator. Thus, integrated physical and functional mapping in mammalian cells can identify signaling components with high confidence and provides unanticipated insights into pathway regulators.
- Published
- 2009
370. Conservation of core gene expression in vertebrate tissues.
- Author
- Chan ET, Quon GT, Chua G, Babak T, Trochesset M, Zirngibl RA, Aubin J, Ratcliffe MJ, Wilde A, Brudno M, Morris QD, and Hughes TR
- Subjects
- Animals, Anura, Base Sequence, Chickens, Conserved Sequence genetics, DNA analysis, DNA genetics, Evolution, Molecular, Gene Expression Profiling, Humans, Mice, Sequence Alignment, Sequence Analysis, DNA, Tetraodontiformes, Transcription Factors biosynthesis, Transcription Factors genetics, Gene Expression Regulation, Vertebrates genetics, Vertebrates metabolism
- Abstract
Background: Vertebrates share the same general body plan and organs, possess related sets of genes, and rely on similar physiological mechanisms, yet show great diversity in morphology, habitat and behavior. Alteration of gene regulation is thought to be a major mechanism in phenotypic variation and evolution, but relatively little is known about the broad patterns of conservation in gene expression in non-mammalian vertebrates. Results: We measured expression of all known and predicted genes across twenty tissues in chicken, frog and pufferfish. By combining the results with human and mouse data and considering only ten common tissues, we have found evidence of conserved expression for more than a third of unique orthologous genes. We find that, on average, transcription factor gene expression is neither more nor less conserved than that of other genes. Strikingly, conservation of expression correlates poorly with the amount of conserved nonexonic sequence, even using a sequence alignment technique that accounts for non-collinearity in conserved elements. Many genes show conserved human/fish expression despite having almost no nonexonic conserved primary sequence. Conclusions: There are clearly strong evolutionary constraints on tissue-specific gene expression. A major challenge will be to understand the precise mechanisms by which many gene expression patterns remain similar despite extensive cis-regulatory restructuring.
- Published
- 2009
371. Post-transcriptional gene regulation: RNA-protein interactions, RNA processing, mRNA stability and localization.
- Author
-
Blencowe B, Brenner S, Hughes T, and Morris Q
- Subjects
- Biometry, Data Interpretation, Statistical, Databases, Genetic, Databases, Protein, Humans, Models, Genetic, RNA, Messenger genetics, RNA, Messenger metabolism, RNA-Binding Proteins chemistry, RNA-Binding Proteins genetics, RNA-Binding Proteins metabolism, RNA Processing, Post-Transcriptional, RNA Stability
- Abstract
The goal of our workshop is to introduce some recent work in the area of post-transcriptional regulation to a wider computational community, discuss some of the unique computational problems faced in this area, and to present some preliminary solutions to these problems. In particular, we will focus on emerging computational and large-scale experimental strategies (e.g. microarray and deep sequencing) for investigating aspects of gene regulation at the post-transcriptional level, with an emphasis on the identification and characterization of the cis- and trans-acting RNA and protein components involved. We will also be exploring new developments in computational methods to detect and characterize cis-regulatory signals encoded in mRNAs.
- Published
- 2009
372. Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence preferences.
- Author
-
Berger MF, Badis G, Gehrke AR, Talukder S, Philippakis AA, Peña-Castillo L, Alleyne TM, Mnaimneh S, Botvinnik OB, Chan ET, Khalid F, Zhang W, Newburger D, Jaeger SA, Morris QD, Bulyk ML, and Hughes TR
- Subjects
- Animals, Base Sequence, Computational Biology, Conserved Sequence, DNA metabolism, Evolution, Molecular, Homeodomain Proteins metabolism, Mice, Models, Molecular, Protein Binding, Transcription Factors chemistry, Transcription Factors metabolism, DNA chemistry, Homeodomain Proteins chemistry
- Abstract
Most homeodomains are unique within a genome, yet many are highly conserved across vast evolutionary distances, implying strong selection on their precise DNA-binding specificities. We determined the binding preferences of the majority (168) of mouse homeodomains to all possible 8-base sequences, revealing rich and complex patterns of sequence specificity and showing that there are at least 65 distinct homeodomain DNA-binding activities. We developed a computational system that successfully predicts binding sites for homeodomain proteins as distant from mouse as Drosophila and C. elegans, and we infer full 8-mer binding profiles for the majority of known animal homeodomains. Our results provide an unprecedented level of resolution in the analysis of this simple domain structure and suggest that variation in sequence recognition may be a factor in its functional diversity and evolutionary success.
- Published
- 2008
373. A critical assessment of Mus musculus gene function prediction using integrated genomic evidence.
- Author
-
Peña-Castillo L, Tasan M, Myers CL, Lee H, Joshi T, Zhang C, Guan Y, Leone M, Pagnani A, Kim WK, Krumpelman C, Tian W, Obozinski G, Qi Y, Mostafavi S, Lin GN, Berriz GF, Gibbons FD, Lanckriet G, Qiu J, Grant C, Barutcuoglu Z, Hill DP, Warde-Farley D, Grouios C, Ray D, Blake JA, Deng M, Jordan MI, Noble WS, Morris Q, Klein-Seetharaman J, Bar-Joseph Z, Chen T, Sun F, Troyanskaya OG, Marcotte EM, Xu D, Hughes TR, and Roth FP
- Subjects
- Animals, Mice metabolism, Algorithms, Mice genetics, Proteins genetics, Proteins metabolism
- Abstract
Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. Results: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. Conclusion: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.
- Published
- 2008
374. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function.
- Author
-
Mostafavi S, Ray D, Warde-Farley D, Grouios C, and Morris Q
- Subjects
- Animals, Computer Communication Networks, Genomics, Mice metabolism, Proteomics, Saccharomyces cerevisiae genetics, Time Factors, Algorithms, Mice genetics, Proteins genetics, Proteins metabolism
- Abstract
Background: Most successful computational approaches for protein function prediction integrate multiple genomics and proteomics data sources to make inferences about the function of unknown proteins. The most accurate of these algorithms have long running times, making them unsuitable for real-time protein function prediction in large genomes. As a result, the predictions of these algorithms are stored in static databases that can easily become outdated. We propose a new algorithm, GeneMANIA, that is as accurate as the leading methods, while capable of predicting protein function in real-time. Results: We use a fast heuristic algorithm, derived from ridge regression, to integrate multiple functional association networks and predict gene function from a single process-specific network using label propagation. Our algorithm is efficient enough to be deployed on a modern webserver and is as accurate as, or more so than, the leading methods on the MouseFunc I benchmark and a new yeast function prediction benchmark; it is robust to redundant and irrelevant data and requires, on average, less than ten seconds of computation time on tasks from these benchmarks. Conclusion: GeneMANIA is fast enough to predict gene function on-the-fly while achieving state-of-the-art accuracy. A prototype version of a GeneMANIA-based webserver is available at http://morrislab.med.utoronto.ca/prototype.
- Published
- 2008
375. Using expression profiling data to identify human microRNA targets.
- Author
-
Huang JC, Babak T, Corson TW, Chua G, Khan S, Gallie BL, Hughes TR, Blencowe BJ, Frey BJ, and Morris QD
- Subjects
- Base Sequence, Humans, Molecular Sequence Data, Gene Expression Profiling methods, Gene Targeting methods, MicroRNAs genetics, Oligonucleotide Array Sequence Analysis methods, Sequence Analysis, RNA methods
- Abstract
We demonstrate that paired expression profiles of microRNAs (miRNAs) and mRNAs can be used to identify functional miRNA-target relationships with high precision. We used a Bayesian data analysis algorithm, GenMiR++, to identify a network of 1,597 high-confidence target predictions for 104 human miRNAs, which was supported by RNA expression data across 88 tissues and cell types, sequence complementarity and comparative genomics data. We experimentally verified our predictions by investigating the result of let-7b downregulation in retinoblastoma using quantitative reverse transcriptase (RT)-PCR and microarray profiling: some of our verified let-7b targets include CDC25A and BCL7A. Compared to sequence-based predictions, our high-scoring GenMiR++ predictions had much more consistent Gene Ontology annotations and were more accurate predictors of which mRNA levels respond to changes in let-7b levels.
- Published
- 2007
376. RankMotif++: a motif-search algorithm that accounts for relative ranks of K-mers in binding transcription factors.
- Author
-
Chen X, Hughes TR, and Morris Q
- Subjects
- Amino Acid Motifs, Base Sequence, Binding Sites, Molecular Sequence Data, Protein Binding, Algorithms, Sequence Analysis, DNA methods, Transcription Factors chemistry, Transcription Factors genetics
- Abstract
Motivation: The sequence specificity of DNA-binding proteins is typically represented as a position weight matrix in which each base position contributes independently to relative affinity. Assessment of the accuracy and broad applicability of this representation has been limited by the lack of extensive DNA-binding data. However, new microarray techniques, in which preferences for all possible K-mers are measured, enable a broad comparison of both motif representation and methods for motif discovery. Here, we consider the problem of accounting for all of the binding data in such experiments, rather than the highest affinity binding data. We introduce RankMotif++, an algorithm designed for finding motifs whenever sequences are associated with a semi-quantitative measure of protein-DNA-binding affinity. RankMotif++ learns motif models by maximizing the likelihood of a set of binding preferences under a probabilistic model of how sequence binding affinity translates into binding preference observations. Because RankMotif++ makes few assumptions about the relationship between binding affinity and the semi-quantitative readout, it is applicable to a wide variety of experimental assays of DNA-binding preference. Results: By several criteria, RankMotif++ predicts binding affinity better than two widely used motif finding algorithms (MDScan, MatrixREDUCE) or more recently developed algorithms (PREGO, Seed and Wobble), and its performance is comparable to a motif model that separately assigns affinities to 8-mers. Our results validate the PWM model and provide an approximation of the precision and recall that can be expected in a genomic scan. Availability: RankMotif++ is available upon request. Supplementary Information: Supplementary data are available at Bioinformatics online.
- Published
- 2007
377. Prediction and testing of novel transcriptional networks regulating embryonic stem cell self-renewal and commitment.
- Author
-
Walker E, Ohishi M, Davey RE, Zhang W, Cassar PA, Tanaka TS, Der SD, Morris Q, Hughes TR, Zandstra PW, and Stanford WL
- Subjects
- Cell Lineage, DNA-Binding Proteins genetics, Electroporation, HMGB Proteins genetics, Humans, Octamer Transcription Factor-3 genetics, Pluripotent Stem Cells cytology, Polymerase Chain Reaction, RNA, Small Interfering, SOXB1 Transcription Factors, Transcription Factors genetics, Embryonic Stem Cells cytology, Transcription, Genetic
- Abstract
Stem cell fate is governed by the integration of intrinsic and extrinsic positive and negative signals upon inherent transcriptional networks. To identify novel embryonic stem cell (ESC) regulators and assemble transcriptional networks controlling ESC fate, we performed temporal expression microarray analyses of ESCs after the initiation of commitment and integrated these data with known genome-wide transcription factor binding. Effects of forced under- or overexpression of predicted novel regulators, defined as differentially expressed genes with potential binding sites for known regulators of pluripotency, demonstrated greater than 90% correspondence with predicted function, as assessed by functional and high-content assays of self-renewal. We next assembled 43 theoretical transcriptional networks in ESCs, 82% (23 out of 28 tested) of which were supported by analysis of genome-wide expression in Oct4 knockdown cells. By using this integrative approach, we have formulated novel networks describing gene repression of key developmental regulators in undifferentiated ESCs and successfully predicted the outcomes of genetic manipulation of these networks.
- Published
- 2007
378. Bayesian inference of MicroRNA targets from sequence and expression data.
- Author
-
Huang JC, Morris QD, and Frey BJ
- Subjects
- Animals, Humans, Bayes Theorem, Gene Expression Profiling trends, Gene Targeting trends, MicroRNAs genetics, Models, Genetic, Sequence Analysis, DNA trends, Sequence Analysis, RNA trends
- Abstract
MicroRNAs (miRNAs) regulate a large proportion of mammalian genes by hybridizing to targeted messenger RNAs (mRNAs) and down-regulating their translation into protein. Although much work has been done in the genome-wide computational prediction of miRNA genes and their target mRNAs, an open question is how to efficiently obtain functional miRNA targets from a large number of candidate miRNA targets predicted by existing computational algorithms. In this paper, we propose a novel Bayesian model and learning algorithm, GenMiR++ (Generative model for miRNA regulation), that accounts for patterns of gene expression using miRNA expression data and a set of candidate miRNA targets. A set of high-confidence functional miRNA targets are then obtained from the data using a Bayesian learning algorithm. Our model scores 467 high-confidence miRNA targets out of 1,770 targets obtained from TargetScanS in mouse at a false detection rate of 2.5%: several confirmed miRNA targets appear in our high-confidence set, such as the interactions between miR-92 and the signal transduction gene MAP2K4, as well as the relationship between miR-16 and BCL2, an anti-apoptotic gene which has been implicated in chronic lymphocytic leukemia. We present results on the robustness of our model showing that our learning algorithm is not sensitive to various perturbations of the data. Our high-confidence targets represent a significant increase in the number of miRNA targets and represent a starting point for a global understanding of gene regulation.
- Published
- 2007
379. A genome-wide assessment of adrenocorticotropin action in the Y1 mouse adrenal tumor cell line.
- Author
-
Schimmer BP, Cordova M, Cheng H, Tsao A, and Morris Q
- Subjects
- Adrenal Gland Neoplasms metabolism, Animals, Cell Line, Tumor, Humans, Mice, Oligonucleotide Array Sequence Analysis, Adrenal Gland Neoplasms genetics, Adrenocorticotropic Hormone metabolism, Genome
- Abstract
This report summarizes the genome-wide effects of ACTH on transcript accumulation in mouse adrenal Y1 cells and the relative contributions of the cAMP-, protein kinase C- and Ca(2+)-dependent signaling pathways to these actions of the hormone. ACTH affected the accumulation of 1386 transcripts, a much larger number than previously appreciated. The cAMP signaling pathway accounted for approximately 56% of the ACTH effects whereas the protein kinase C- and Ca(2+)-dependent pathways made smaller contributions to ACTH action. Approximately 38% of the ACTH-affected transcripts could not be assigned to these signaling pathways and thus represent candidates for regulation via other mechanisms. The set of ACTH-regulated transcripts included clusters with functions in steroid metabolism, cell proliferation and alternative splicing. Collectively, our results suggest that Y1 adrenal cells undergo extensive remodeling upon prolonged stimulation with ACTH. The functional implications of ACTH on alternative splicing are explored.
- Published
- 2007
380. Identifying transcription factor functions and targets by phenotypic activation.
- Author
-
Chua G, Morris QD, Sopko R, Robinson MD, Ryan O, Chan ET, Frey BJ, Andrews BJ, Boone C, and Hughes TR
- Subjects
- Amino Acid Motifs, Binding Sites, Genetic Techniques, Models, Genetic, Phenotype, Promoter Regions, Genetic, Protein Binding, Saccharomyces cerevisiae metabolism, Saccharomyces cerevisiae Proteins metabolism, Transgenes, Gene Expression Regulation, Fungal, Genetics, Oligonucleotide Array Sequence Analysis methods, Saccharomyces cerevisiae Proteins chemistry, Transcription Factors genetics
- Abstract
Mapping transcriptional regulatory networks is difficult because many transcription factors (TFs) are activated only under specific conditions. We describe a generic strategy for identifying genes and pathways induced by individual TFs that does not require knowledge of their normal activation cues. Microarray analysis of 55 yeast TFs that caused a growth phenotype when overexpressed showed that the majority caused increased transcript levels of genes in specific physiological categories, suggesting a mechanism for growth inhibition. Induced genes typically included established targets and genes with consensus promoter motifs, if known, indicating that these data are useful for identifying potential new target genes and binding sites. We identified the sequence 5'-TCACGCAA as a binding sequence for Hms1p, a TF that positively regulates pseudohyphal growth and previously had no known motif. The general strategy outlined here presents a straightforward approach to discovery of TF activities and mapping targets that could be adapted to any organism with transgenic technology.
- Published
- 2006
381. Global profiles of gene expression induced by adrenocorticotropin in Y1 mouse adrenal cells.
- Author
-
Schimmer BP, Cordova M, Cheng H, Tsao A, Goryachev AB, Schimmer AD, and Morris Q
- Subjects
- Alternative Splicing, Animals, Cell Line, Cyclic AMP metabolism, Cyclic AMP-Dependent Protein Kinases metabolism, DNA Primers chemistry, DNA, Complementary metabolism, Down-Regulation, Gene Expression, Genome, Mice, Models, Biological, Mutation, Oligonucleotide Array Sequence Analysis, Protein Kinase C metabolism, RNA, Messenger metabolism, Reverse Transcriptase Polymerase Chain Reaction, Signal Transduction, Steroids metabolism, Up-Regulation, Adrenal Cortex cytology, Adrenocorticotropic Hormone metabolism, Gene Expression Regulation
- Abstract
ACTH regulates the steroidogenic capacity, size, and structural integrity of the adrenal cortex through a series of actions involving changes in gene expression; however, only a limited number of ACTH-regulated genes have been identified, and these only partly account for the global effects of ACTH on the adrenal cortex. In this study, a National Institute on Aging 15K mouse cDNA microarray was used to identify genome-wide changes in gene expression after treatment of Y1 mouse adrenocortical cells with ACTH. ACTH affected the levels of 1275 annotated transcripts, of which 46% were up-regulated. The up-regulated transcripts were enriched for functions associated with steroid biosynthesis and metabolism; the down-regulated transcripts were enriched for functions associated with cell proliferation, nuclear transport and RNA processing, including alternative splicing. A total of 133 different transcripts, i.e. only 10% of the ACTH-affected transcripts, were represented in the categories above; most of these had not been described as ACTH-regulated previously. The contributions of protein kinase A and protein kinase C to these genome-wide effects of ACTH were evaluated in microarray experiments after treatment of Y1 cells and derivative protein kinase A-defective mutants with pharmacological probes of each pathway. Protein kinase A-dependent signaling accounted for 56% of the ACTH effect; protein kinase C-dependent signaling accounted for an additional 6%. These results indicate that ACTH affects the expression profile of Y1 adrenal cells principally through cAMP- and protein kinase A-dependent signaling. The large number of transcripts affected by ACTH anticipates a broader range of actions than previously appreciated.
- Published
- 2006
382. Global survey of organ and organelle protein expression in mouse: combined proteomic and transcriptomic profiling.
- Author
-
Kislinger T, Cox B, Kannan A, Chung C, Hu P, Ignatchenko A, Scott MS, Gramolini AO, Morris Q, Hallett MT, Rossant J, Hughes TR, Frey B, and Emili A
- Subjects
- Animals, Cell Nucleus genetics, Cell Nucleus metabolism, Computational Biology, Gene Expression Regulation, Green Fluorescent Proteins metabolism, Mice, Microsomes metabolism, Mitochondria genetics, Mitochondria metabolism, Organ Specificity, Protein Transport, Proteins chemistry, RNA, Messenger genetics, RNA, Messenger metabolism, Reproducibility of Results, Gene Expression Profiling, Organelles metabolism, Proteins genetics, Proteins metabolism, Proteomics, Transcription, Genetic genetics
- Abstract
Organs and organelles represent core biological systems in mammals, but the diversity in protein composition remains unclear. Here, we combine subcellular fractionation with exhaustive tandem mass spectrometry-based shotgun sequencing to examine the protein content of four major organellar compartments (cytosol, membranes [microsomes], mitochondria, and nuclei) in six organs (brain, heart, kidney, liver, lung, and placenta) of the laboratory mouse, Mus musculus. Using rigorous statistical filtering and machine-learning methods, the subcellular localization of 3274 of the 4768 proteins identified was determined with high confidence, including 1503 previously uncharacterized factors, while tissue selectivity was evaluated by comparison to previously reported mRNA expression patterns. This molecular compendium, fully accessible via a searchable web-browser interface, serves as a reliable reference of the expressed tissue and organelle proteomes of a leading model mammal.
- Published
- 2006
383. GenRate: a generative model that reveals novel transcripts in genome-tiling microarray data.
- Author
-
Frey BJ, Morris QD, and Hughes TR
- Subjects
- Animals, Computational Biology methods, Databases, Nucleic Acid, Mice, Probability, Software, Chromosomes genetics, Genomics, Models, Genetic, Oligonucleotide Array Sequence Analysis
- Abstract
Genome-wide microarray designs containing millions to hundreds of millions of probes are available for a variety of mammals, including mouse and human. These genome tiling arrays can potentially lead to significant advances in science and medicine, e.g., by indicating new genes and alternative primary and secondary transcripts. While bottom-up pattern matching techniques (e.g., hierarchical clustering) can be used to find gene structures in microarray data, we believe the many interacting hidden variables and complex noise patterns more naturally lead to an analysis based on generative models. We describe a generative model of tiling data and show how the sum-product algorithm can be used to infer hybridization noise, probe sensitivity, new transcripts, and alternative transcripts. The method, called GenRate, maximizes a global scoring function that enables multiple transcripts to compete for ownership of putative probes. We apply GenRate to a new exon tiling dataset from mouse chromosome 4 and show that it makes significantly more predictions than a previously described hierarchical clustering method at the same false positive rate. GenRate correctly predicts many known genes and also predicts new gene structures. As new problems arise, additional hidden variables can be incorporated into the model in a principled fashion, so we believe that GenRate will prove to be a useful tool in the new era of genome-wide tiling microarray analysis.
- Published
- 2006
384. Inferring global levels of alternative splicing isoforms using a generative model of microarray data.
- Author
-
Shai O, Morris QD, Blencowe BJ, and Frey BJ
- Subjects
- Artificial Intelligence, Computer Simulation, Models, Statistical, Pattern Recognition, Automated methods, Algorithms, Alternative Splicing genetics, Models, Genetic, Oligonucleotide Array Sequence Analysis methods, RNA, Messenger genetics, Sequence Analysis, RNA methods
- Abstract
Motivation: Alternative splicing (AS) is a frequent step in metazoan gene expression whereby the exons of genes are spliced in different combinations to generate multiple isoforms of mature mRNA. AS functions to enrich an organism's proteomic complexity and regulates gene expression. Despite its importance, the mechanisms underlying AS and its regulation are not well understood, especially in the context of global gene expression patterns. We present here an algorithm referred to as the Generative model for the Alternative Splicing Array Platform (GenASAP) that can predict the levels of AS for thousands of exon skipping events using data generated from custom microarrays. GenASAP uses Bayesian learning in an unsupervised probability model to accurately predict AS levels from the microarray data. GenASAP is capable of learning the hybridization profiles of microarray data, while modeling noise processes and missing or aberrant data. GenASAP has been successfully applied to the global discovery and analysis of AS in mammalian cells and tissues. Results: GenASAP was applied to data obtained from a custom microarray designed for the monitoring of 3126 AS events in mouse cells and tissues. The microarray design included probes specific for exon body and junction sequences formed by the splicing of exons. Our results show that GenASAP provides accurate predictions for over one-third of the total events, as verified by independent RT-PCR assays. Supplementary Information: http://www.psi.toronto.edu/GenASAP.
- Published
- 2006
385. Genome-wide analysis of mouse transcripts using exon microarrays and factor graphs.
- Author
-
Frey BJ, Mohammad N, Morris QD, Zhang W, Robinson MD, Mnaimneh S, Chang R, Pan Q, Sat E, Rossant J, Bruneau BG, Aubin JE, Blencowe BJ, and Hughes TR
- Subjects
- Algorithms, Animals, Gene Expression Profiling, Humans, Mice, Microarray Analysis, RNA, Messenger chemistry, RNA, Messenger metabolism, Computational Biology, DNA, Complementary chemistry, Databases as Topic, Exons genetics, Genome, Transcription, Genetic
- Abstract
Recent mammalian microarray experiments detected widespread transcription and indicated that there may be many undiscovered multiple-exon protein-coding genes. To explore this possibility, we labeled cDNA from unamplified, polyadenylation-selected RNA samples from 37 mouse tissues to microarrays encompassing 1.14 million exon probes. We analyzed these data using GenRate, a Bayesian algorithm that uses a genome-wide scoring function in a factor graph to infer genes. At a stringent exon false detection rate of 2.7%, GenRate detected 12,145 gene-length transcripts and confirmed 81% of the 10,000 most highly expressed known genes. Notably, our analysis showed that most of the 155,839 exons detected by GenRate were associated with known genes, providing microarray-based evidence that most multiple-exon genes have already been identified. GenRate also detected tens of thousands of potential new exons and reconciled discrepancies in current cDNA databases by 'stitching' new transcribed regions into previously annotated genes.
- Published
- 2005
386. Network news: functional modules revealed during early embryogenesis in C. elegans.
- Author
-
Roy PJ and Morris Q
- Subjects
- Animals, Caenorhabditis elegans genetics, Gene Expression Regulation, Developmental, RNA, Messenger genetics, Caenorhabditis elegans embryology, Caenorhabditis elegans physiology, Computational Biology, Embryonic Development
- Abstract
The functional module is fast becoming the operational unit of the postgenomics era. A new report in Nature by Gunsalus and colleagues describes, using a multiply supported network, functional modules within early C. elegans embryos and identifies several new components of known molecular machines (Gunsalus et al., 2005).
- Published
- 2005
387. Multi-way clustering of microarray data using probabilistic sparse matrix factorization.
- Author
-
Dueck D, Morris QD, and Frey BJ
- Subjects
- Algorithms, Animals, Genome, Humans, Likelihood Functions, Models, Statistical, Probability, RNA, Messenger metabolism, Software, Cluster Analysis, Computational Biology methods, Oligonucleotide Array Sequence Analysis methods
- Abstract
Motivation: We address the problem of multi-way clustering of microarray data using a generative model. Our algorithm, probabilistic sparse matrix factorization (PSMF), is a probabilistic extension of a previous hard-decision algorithm for this problem. PSMF allows for varying levels of sensor noise in the data, uncertainty in the hidden prototypes used to explain the data and uncertainty as to the prototypes selected to explain each data vector. Results: We present experimental results demonstrating that our method can better recover functionally-relevant clusterings in mRNA expression data than standard clustering techniques, including hierarchical agglomerative clustering, and we show that by computing probabilities instead of point estimates, our method avoids converging to poor solutions.
- Published
- 2005
388. GenXHC: a probabilistic generative model for cross-hybridization compensation in high-density genome-wide microarray data.
- Author
-
Huang JC, Morris QD, Hughes TR, and Frey BJ
- Subjects
- Animals, DNA, Complementary metabolism, Exons, Genome, Hybridization, Genetic, Mice, Models, Statistical, Nucleic Acid Hybridization, Nucleotides chemistry, Oligonucleotides chemistry, Probability, RNA, Messenger metabolism, Software, Computational Biology methods, Oligonucleotide Array Sequence Analysis methods
- Abstract
Motivation: Microarray designs containing millions to hundreds of millions of probes that tile entire genomes are currently being released. Within the next 2 months, our group will release a microarray data set containing over 12,000,000 microarray measurements taken from 37 mouse tissues. A problem that will become increasingly significant in the upcoming era of genome-wide exon-tiling microarray experiments is the removal of cross-hybridization noise. We present a probabilistic generative model for cross-hybridization in microarray data and a corresponding variational learning method for cross-hybridization compensation, GenXHC, that reduces cross-hybridization noise by taking into account multiple sources for each mRNA expression level measurement, as well as prior knowledge of hybridization similarities between the nucleotide sequences of microarray probes and their target cDNAs. Results: The algorithm is applied to a subset of an exon-resolution genome-wide Agilent microarray data set for chromosome 16 of Mus musculus and is found to produce statistically significant reductions in cross-hybridization noise. The denoised data is found to produce enrichment in multiple gene ontology-biological process (GO-BP) functional groups. The algorithm is found to outperform robust multi-array analysis, another method for cross-hybridization compensation.
- Published
- 2005
389. Alternative splicing of conserved exons is frequently species-specific in human and mouse.
- Author
-
Pan Q, Bakowski MA, Morris Q, Zhang W, Frey BJ, Hughes TR, and Blencowe BJ
- Subjects
- Animals, Exons, Expressed Sequence Tags, Genome, Humans, Mice, Models, Genetic, Protein Structure, Tertiary, Software, Species Specificity, Alternative Splicing
- Abstract
In this article, we provide evidence that a frequent source of diversity between mammalian transcripts occurs as a consequence of species-specific alternative splicing (AS) of conserved exons. Using a highly predictive computational method, we estimate that >11% of human and mouse cassette alternative exons undergo skipping in one species but constitutive splicing in the other. These species-specific AS events are predicted to modify conserved domains in proteins more frequently than other classes of AS events. The results thus provide evidence that species-specific AS of conserved exons constitutes an additional potential source of complexity and species-specific differences between mammals.
- Published
- 2005
390. Detection and discovery of RNA modifications using microarrays.
- Author
-
Hiley SL, Jackman J, Babak T, Trochesset M, Morris QD, Phizicky E, and Hughes TR
- Subjects
- Mutation, RNA, Fungal chemistry, RNA, Fungal metabolism, RNA, Transfer chemistry, RNA, Transfer metabolism, RNA, Untranslated chemistry, Saccharomyces cerevisiae enzymology, Saccharomyces cerevisiae genetics, Oligonucleotide Array Sequence Analysis methods, RNA Processing, Post-Transcriptional, RNA, Untranslated metabolism, Saccharomyces cerevisiae metabolism
- Abstract
Using a microarray that tiles all known yeast non-coding RNAs, we compared RNA from wild-type cells with RNA from mutants encoding known and putative RNA modifying enzymes. We show that at least five types of RNA modification (dihydrouridine, m1G, m2(2)G, m1A and m6(2)A) catalyzed by 10 different enzymes (Trm1p, Trm5p, Trm10p, Dus1p-Dus4p, Dim1p, Gcd10p and Gcd14p) can be detected by virtue of differential hybridization to oligonucleotides on the array that are complementary to the modified sites. Using this approach, we identified a previously undetected m1A modification in GlnCTG tRNA, the formation of which is catalyzed by the Gcd10/Gcd14 complex.
- Published
- 2005
391. Genrate: a generative model that finds and scores new genes and exons in genomic microarray data.
- Author
-
Frey BJ, Morris QD, Zhang W, Mohammad N, and Hughes TR
- Subjects
- Animals, Computational Biology methods, Databases, Nucleic Acid, Mice, Probability, Software, Genomics, Models, Genetic, Oligonucleotide Array Sequence Analysis
- Abstract
Recently, researchers have made some progress in using microarrays to validate predicted exons in genome sequence and find new gene structures. However, current methods rely on separately making threshold-based decisions on intensity of expression, similarity of expression profiles, and arrangements of exons in the genome. We have taken a Bayesian approach and developed GenRate, a generative model that accounts for both genome-wide expression data taken from multiple conditions (e.g. tissues) and co-location and density of probes in DNA sequence data. GenRate balances probabilistic evidence derived from different sources and outputs scores (log-likelihoods) for each gene model, enabling the estimation of false-positive and false-negative rates. The model has a number of local minima that is exponential in the length of the DNA sequence data, so direct application of the EM learning algorithm produces poor results. We describe a novel way of parameterizing the model using examples from the data set, so that good solutions are found using an efficient algorithm. We apply GenRate to a subset of mouse genome-wide expression data that we have created, and discuss the statistical significance of the genes found by GenRate. Three of the highest-ranking gene structures found by GenRate, each containing thousands of bases from the genome, are confirmed using RT-PCR experiments.
- Published
- 2005
392. Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform.
- Author
-
Pan Q, Shai O, Misquitta C, Zhang W, Saltzman AL, Mohammad N, Babak T, Siu H, Hughes TR, Morris QD, Frey BJ, and Blencowe BJ
- Subjects
- Animals, Brain metabolism, Evolution, Molecular, Male, Mice, Organ Specificity, Testis metabolism, Alternative Splicing physiology, Gene Expression Regulation physiology, Oligonucleotide Array Sequence Analysis, RNA metabolism
- Abstract
We describe the application of a microarray platform, which combines information from exon body and splice-junction probes, to perform a quantitative analysis of tissue-specific alternative splicing (AS) for thousands of exons in mammalian cells. Through this system, we have analyzed global features of AS in major mouse tissues. The results provide numerous inferences for the functions of tissue-specific AS, insights into how the evolutionary history of exons can impact on their inclusion levels, and also information on how global regulatory properties of AS define tissue type. Like global transcription profiles, global AS profiles reflect tissue identity. Interestingly, we find that transcription and AS act independently on different sets of genes in order to define tissue-specific expression profiles. These results demonstrate the utility of our quantitative microarray platform and data for revealing important global regulatory features of AS.
- Published
- 2004
393. Transcriptional networks: reverse-engineering gene regulation on a global scale.
- Author
-
Chua G, Robinson MD, Morris Q, and Hughes TR
- Subjects
- Computational Biology methods, Oligonucleotide Array Sequence Analysis, Regulatory Sequences, Nucleic Acid, Saccharomyces cerevisiae genetics, Saccharomyces cerevisiae Proteins chemistry, Saccharomyces cerevisiae Proteins genetics, Transcription Factors genetics, Transcription Factors metabolism, Gene Expression Regulation, Fungal, Saccharomyces cerevisiae metabolism, Saccharomyces cerevisiae Proteins metabolism, Transcription, Genetic
- Abstract
A major objective in post-genome research is to fully understand the transcriptional control of each gene and the targets of each transcription factor. In yeast, large-scale experimental and computational approaches have been applied to identify co-regulated genes, cis regulatory elements, and transcription factor DNA binding sites in vivo. Methods for modeling and predicting system behavior, and for reconciling discrepancies among data types, are being explored. The results indicate that a complete and comprehensive yeast transcriptional network will ultimately be achieved.
- Published
- 2004
394. Probing microRNAs with microarrays: tissue specificity and functional inference.
- Author
-
Babak T, Zhang W, Morris Q, Blencowe BJ, and Hughes TR
- Subjects
- Animals, Female, Fluorescent Dyes, Gene Expression Profiling, Male, Organ Specificity, RNA Processing, Post-Transcriptional, RNA, Messenger analysis, Sensitivity and Specificity, Sequence Analysis, DNA, Tissue Distribution, Mice genetics, MicroRNAs genetics, MicroRNAs metabolism, Oligonucleotide Array Sequence Analysis
- Abstract
MicroRNAs (miRNAs) are short, stable, noncoding RNAs involved in post-transcriptional gene silencing via hybridization to mRNA. Few have been thoroughly characterized in any species. Here, we describe a method to detect miRNAs using micro-arrays, in which the miRNAs are directly hybridized to the array. We used this method to analyze miRNA expression across 17 mouse organs and tissues. More than half of the 78 miRNAs detected were expressed in specific adult tissues, suggesting that miRNAs have widespread regulatory roles in adults. By comparing miRNA levels to mRNA levels determined in a parallel microarray analysis of the same tissues, we found that the expression of target mRNAs predicted on the basis of sequence complementarity is unrelated to the tissues in which the corresponding miRNA is expressed.
- Published
- 2004
395. Exploration of essential gene functions via titratable promoter alleles.
- Author
-
Mnaimneh S, Davierwala AP, Haynes J, Moffat J, Peng WT, Zhang W, Yang X, Pootoolal J, Chua G, Lopez A, Trochesset M, Morse D, Krogan NJ, Hiley SL, Li Z, Morris Q, Grigull J, Mitsakakis N, Roberts CJ, Greenblatt JF, Boone C, Kaiser CA, Andrews BJ, and Hughes TR
- Subjects
- Feedback, Physiological, Gene Deletion, Gene Expression Profiling, Genes, Fungal, Mitochondria metabolism, Models, Genetic, Oligonucleotide Array Sequence Analysis, Pharmaceutical Preparations metabolism, Protein Processing, Post-Translational, RNA, Transfer metabolism, Ribosomal Proteins genetics, Ribosomal Proteins metabolism, Saccharomyces cerevisiae drug effects, Saccharomyces cerevisiae genetics, Saccharomyces cerevisiae metabolism, Saccharomyces cerevisiae Proteins genetics, Saccharomyces cerevisiae Proteins metabolism, Transcription, Genetic, Alleles, Gene Expression Regulation, Fungal, Genes, Essential, Promoter Regions, Genetic
- Abstract
Nearly 20% of yeast genes are required for viability, hindering genetic analysis with knockouts. We created promoter-shutoff strains for over two-thirds of all essential yeast genes and subjected them to morphological analysis, size profiling, drug sensitivity screening, and microarray expression profiling. We then used this compendium of data to ask which phenotypic features characterized different functional classes and used these to infer potential functions for uncharacterized genes. We identified genes involved in ribosome biogenesis (HAS1, URB1, and URB2), protein secretion (SEC39), mitochondrial import (MIM1), and tRNA charging (GSN1). In addition, apparent negative feedback transcriptional regulation of both ribosome biogenesis and the proteasome was observed. We furthermore show that these strains are compatible with automated genetic analysis. This study underscores the importance of analyzing mutant phenotypes and provides a resource to complement the yeast knockout collection.
- Published
- 2004
396. The functional landscape of mouse gene expression.
- Author
-
Zhang W, Morris QD, Chang R, Shai O, Bakowski MA, Mitsakakis N, Mohammad N, Robinson MD, Zirngibl R, Somogyi E, Laurin N, Eftekharpour E, Sat E, Grigull J, Pan Q, Peng WT, Krogan N, Greenblatt J, Fehlings M, van der Kooy D, Aubin J, Bruneau BG, Rossant J, Blencowe BJ, Frey BJ, and Hughes TR
- Subjects
- Animals, Computational Biology, Organ Specificity, RNA, Messenger analysis, RNA, Messenger genetics, Reproducibility of Results, Transcription, Genetic genetics, Gene Expression Profiling, Gene Expression Regulation, Genomics, Mice genetics, Oligonucleotide Array Sequence Analysis
- Abstract
Background: Large-scale quantitative analysis of transcriptional co-expression has been used to dissect regulatory networks and to predict the functions of new genes discovered by genome sequencing in model organisms such as yeast. Although the idea that tissue-specific expression is indicative of gene function in mammals is widely accepted, it has not been objectively tested nor compared with the related but distinct strategy of correlating gene co-expression as a means to predict gene function. Results: We generated microarray expression data for nearly 40,000 known and predicted mRNAs in 55 mouse tissues, using custom-built oligonucleotide arrays. We show that quantitative transcriptional co-expression is a powerful predictor of gene function. Hundreds of functional categories, as defined by Gene Ontology 'Biological Processes', are associated with characteristic expression patterns across all tissues, including categories that bear no overt relationship to the tissue of origin. In contrast, simple tissue-specific restriction of expression is a poor predictor of which genes are in which functional categories. As an example, the highly conserved mouse gene PWP1 is widely expressed across different tissues but is co-expressed with many RNA-processing genes; we show that the uncharacterized yeast homolog of PWP1 is required for rRNA biogenesis. Conclusions: We conclude that 'functional genomics' strategies based on quantitative transcriptional co-expression will be as fruitful in mammals as they have been in simpler organisms, and that transcriptional control of mammalian physiology is more modular than is generally appreciated. Our data and analyses provide a public resource for mammalian functional genomics.
- Published
- 2004
397. A panoramic view of yeast noncoding RNA processing.
- Author
-
Peng WT, Robinson MD, Mnaimneh S, Krogan NJ, Cagney G, Morris Q, Davierwala AP, Grigull J, Yang X, Zhang W, Mitsakakis N, Ryan OW, Datta N, Jojic V, Pal C, Canadien V, Richards D, Beattie B, Wu LF, Altschuler SJ, Roweis S, Frey BJ, Emili A, Greenblatt JF, and Hughes TR
- Subjects
- Cells, Cultured, Fungal Proteins genetics, Fungal Proteins isolation & purification, Oligonucleotide Array Sequence Analysis, Phenotype, RNA Precursors biosynthesis, RNA Precursors genetics, RNA, Small Nucleolar biosynthesis, RNA, Small Nucleolar genetics, RNA, Transfer biosynthesis, RNA, Transfer genetics, RNA, Untranslated genetics, Yeasts genetics, Gene Expression Regulation, Fungal genetics, Genome, Fungal, Mutation genetics, RNA, Untranslated biosynthesis, Ribonucleoproteins biosynthesis, Yeasts metabolism
- Abstract
Predictive analysis using publicly available yeast functional genomics and proteomics data suggests that many more proteins may be involved in biogenesis of ribonucleoproteins than are currently known. Using a microarray that monitors abundance and processing of noncoding RNAs, we analyzed 468 yeast strains carrying mutations in protein-coding genes, most of which have not previously been associated with RNA or RNP synthesis. Many strains mutated in uncharacterized genes displayed aberrant noncoding RNA profiles. Ten factors involved in noncoding RNA biogenesis were verified by further experimentation, including a protein required for 20S pre-rRNA processing (Tsr2p), a protein associated with the nuclear exosome (Lrp1p), and a factor required for box C/D snoRNA accumulation (Bcd1p). These data present a global view of yeast noncoding RNA processing and confirm that many currently uncharacterized yeast proteins are involved in biogenesis of noncoding RNA.
- Published
- 2003