54 results on '"Ramoni MF"'
Search Results
2. Integrative predictive model of coronary artery calcification in atherosclerosis.
- Author
-
McGeachie M, Ramoni RL, Mychaleckyj JC, Furie KL, Dreyfuss JM, Liu Y, Herrington D, Guo X, Lima JA, Post W, Rotter JI, Rich S, Sale M, and Ramoni MF
- Published
- 2009
3. Predictive genomics of cardioembolic stroke.
- Author
-
Ramoni RB, Himes BE, Sale MM, Furie KL, and Ramoni MF
- Published
- 2009
4. How accurate can genetic predictions be?
- Author
-
Dreyfuss JM, Levner D, Galagan JE, Church GM, and Ramoni MF
- Subjects
- Area Under Curve, Breast Neoplasms diagnosis, Computer Simulation, Diabetes Mellitus, Type 2 diagnosis, Female, Genome-Wide Association Study, Humans, Precision Medicine, Predictive Value of Tests, Prognosis, ROC Curve, Breast Neoplasms genetics, Diabetes Mellitus, Type 2 genetics, Genome, Human, Models, Genetic
- Abstract
Background: Pre-symptomatic prediction of disease and drug response based on genetic testing is a critical component of personalized medicine. Previous work has demonstrated that the predictive capacity of genetic testing is constrained by the heritability and prevalence of the tested trait, although these constraints have only been approximated under the assumption of a normally distributed genetic risk distribution. Results: Here, we mathematically derive the absolute limits that these factors impose on test accuracy in the absence of any distributional assumptions on risk. We present these limits in terms of the best-case receiver-operating characteristic (ROC) curve, consisting of the best-case test sensitivities and specificities, and the AUC (area under the curve) measure of accuracy. We apply our method to genetic prediction of type 2 diabetes and breast cancer, and we additionally show the best possible accuracy that can be obtained from integrated predictors, which can incorporate non-genetic features. Conclusion: Knowledge of such limits is valuable in understanding the implications of genetic testing even before additional associations are identified.
- Published
- 2012
5. Glycosaminoglycans and glucose prevent apoptosis in 4-methylumbelliferone-treated human aortic smooth muscle cells.
- Author
-
Vigetti D, Rizzi M, Moretto P, Deleonibus S, Dreyfuss JM, Karousou E, Viola M, Clerici M, Hascall VC, Ramoni MF, De Luca G, and Passi A
- Subjects
- Cell Movement, Cell Proliferation, Glycoproteins metabolism, Humans, Hyaluronan Receptors biosynthesis, Hymecromone pharmacology, Inflammation, Mitogen-Activated Protein Kinase 1 metabolism, Mitogen-Activated Protein Kinase 3 metabolism, Oligonucleotide Array Sequence Analysis, Phosphatidylinositol 3-Kinases metabolism, Toll-Like Receptor 4 metabolism, Aorta pathology, Apoptosis, Glucose metabolism, Glycosaminoglycans metabolism, Hyaluronic Acid pharmacology, Hymecromone analogs & derivatives, Myocytes, Smooth Muscle cytology
- Abstract
Smooth muscle cells (SMCs) have a pivotal role in cardiovascular diseases and are responsible for hyaluronan (HA) deposition in thickening vessel walls. HA regulates SMC proliferation, migration, and inflammation, which accelerates neointima formation. We used the HA synthesis inhibitor 4-methylumbelliferone (4-MU) to reduce HA production in human aortic SMCs and found a significant increase of apoptotic cells. Interestingly, the exogenous addition of HA together with 4-MU reduced apoptosis. A similar anti-apoptotic effect was observed also by adding other glycosaminoglycans and glucose to 4-MU-treated cells. Furthermore, the anti-apoptotic effect of HA was mediated by Toll-like receptor 4, CD44, and PI3K but not by ERK1/2.
- Published
- 2011
6. Inferring cell cycle feedback regulation from gene expression data.
- Author
-
Ferrazzi F, Engel FB, Wu E, Moseman AP, Kohane IS, Bellazzi R, and Ramoni MF
- Subjects
- Bayes Theorem, Computer Simulation, Databases, Genetic, HeLa Cells, Humans, Models, Biological, Cell Cycle genetics, Computational Biology methods, Feedback, Physiological physiology, Gene Expression, Gene Regulatory Networks
- Abstract
Feedback control is an important regulatory process in biological systems, which confers robustness against external and internal disturbances. Genes involved in feedback structures are therefore likely to have a major role in regulating cellular processes. Here we rely on a dynamic Bayesian network approach to identify feedback loops in cell cycle regulation. We analyzed the transcriptional profile of the cell cycle in HeLa cancer cells and identified a feedback loop structure composed of 10 genes. In silico analyses showed that these genes hold important roles in the system's dynamics. The results of published experimental assays confirmed the central role of 8 of the identified feedback loop genes in cell cycle regulation. In conclusion, we provide a novel approach to identify critical genes for the dynamics of biological processes. This may lead to the identification of therapeutic targets in diseases that involve perturbations of these dynamics. (Copyright © 2011 Elsevier Inc. All rights reserved.)
- Published
- 2011
7. A transcriptional network signature characterizes lung cancer subtypes.
- Author
-
Chang HH, Dreyfuss JM, and Ramoni MF
- Subjects
- Bayes Theorem, Chromosomes, Human, Pair 12, Humans, Systems Biology methods, Adenocarcinoma genetics, Carcinoma, Squamous Cell genetics, Gene Regulatory Networks, Lung Neoplasms classification, Lung Neoplasms genetics
- Abstract
Background: Transcriptional networks play a central role in cancer development. The authors described a systems biology approach to cancer classification based on the reverse engineering of the transcriptional network surrounding the 2 most common types of lung cancer: adenocarcinoma (AC) and squamous cell carcinoma (SCC). Methods: A transcriptional network classifier was inferred from the molecular profiles of 111 human lung carcinomas. The authors tested its classification accuracy in 7 independent cohorts, for a total of 422 subjects of Caucasian, African, and Asian descent. Results: The model for distinguishing AC from SCC was a 25-gene network signature. Its performance on the 7 independent cohorts achieved 95.2% classification accuracy. Even more surprisingly, 95% of this accuracy was explained by the interplay of 3 genes (KRT6A, KRT6B, KRT6C) on a narrow cytoband of chromosome 12. The role of this chromosomal region in distinguishing AC and SCC was further confirmed by the analysis of another group of 28 independent subjects assayed by DNA copy number changes. The copy number variations of bands 12q12, 12q13, and 12q12-13 discriminated these samples with 84% accuracy. Conclusions: These results suggest the existence of a robust signature localized in a relatively small area of the genome, and show the clinical potential of reverse engineering transcriptional networks from molecular profiles. (Copyright © 2010 American Cancer Society.)
- Published
- 2011
8. Is the reduction of dimensionality to a small number of features always necessary in constructing predictive models for analysis of complex diseases or behaviours?
- Author
-
Zollanvari A, Saccone NL, Bierut LJ, Ramoni MF, and Alterovitz G
- Subjects
- Humans, Behavior, Disease, Models, Theoretical
- Abstract
Gene expression and genome-wide association data have provided researchers the opportunity to study many complex traits and diseases. When designing prognostic and predictive models capable of phenotypic classification in this area, significant reduction of dimensionality through stringent filtering and/or feature selection is often deemed imperative. Here, this work challenges this presumption through both theoretical and empirical analysis. This work demonstrates that by a proper compromise between structure of the selected model and the number of features, one is able to achieve better performance even in large dimensionality. The inclusion of many genes/variants in the classification rules can help shed new light on the analysis of complex traits: traits that are typically determined by many causal variants with small effect size.
- Published
- 2011
9. Genome-wide association for smoking cessation success in a trial of precessation nicotine replacement.
- Author
-
Uhl GR, Drgon T, Johnson C, Ramoni MF, Behm FM, and Rose JE
- Subjects
- Adult, Bayes Theorem, Carbon Monoxide analysis, Genetic Testing, Genotype, Humans, Polymorphism, Single Nucleotide, Tobacco Use Disorder genetics, Treatment Outcome, Genome-Wide Association Study, Nicotine therapeutic use, Smoking genetics, Smoking therapy, Smoking Cessation methods
- Abstract
Abilities to successfully quit smoking display substantial evidence for heritability in classic and molecular genetic studies. Genome-wide association (GWA) studies have demonstrated single-nucleotide polymorphisms (SNPs) and haplotypes that distinguish successful quitters from individuals who were unable to quit smoking in clinical trial participants and in community samples. Many of the subjects in these clinical trial samples were aided by nicotine replacement therapy (NRT). We now report novel GWA results from participants in a clinical trial that sought dose/response relationships for "precessation" NRT. In this trial, 369 European-American smokers were randomized to 21 or 42 mg NRT, initiated 2 wks before target quit dates. Ten-week continuous smoking abstinence was assessed on the basis of self-reports and carbon monoxide levels. SNP genotyping used Affymetrix 6.0 arrays. GWA results for smoking cessation success provided no P value that reached "genome-wide" significance. Compared with chance, these results do identify (a) more clustering of nominally positive results within small genomic regions, (b) more overlap between these genomic regions and those identified in six prior successful smoking cessation GWA studies and (c) sets of genes that fall into gene ontology categories that appear to be biologically relevant. The 1,000 SNPs with the strongest associations form a plausible Bayesian network; no such network is formed by randomly selected sets of SNPs. The data provide independent support, based on individual genotyping, for many loci previously nominated on the basis of data from genotyping in pooled DNA samples. These results provide further support for the idea that aid for smoking cessation may be personalized on the basis of genetic predictors of outcome.
- Published
- 2010
10. Association of linear growth impairment in pediatric Crohn's disease and a known height locus: a pilot study.
- Author
-
Lee JJ, Essers JB, Kugathasan S, Escher JC, Lettre G, Butler JL, Stephens MC, Ramoni MF, Grand RJ, and Hirschhorn J
- Subjects
- Adolescent, Child, Child, Preschool, Crohn Disease genetics, Cross-Sectional Studies, Female, Genetic Predisposition to Disease, Genotype, Growth Disorders genetics, Humans, Infant, Intracellular Signaling Peptides and Proteins, Male, Pilot Projects, Proteins metabolism, White People, Body Height genetics, Growth Disorders etiology
- Abstract
The etiology of growth impairment in Crohn's disease (CD) has been inadequately explained by nutritional, hormonal, and/or disease-related factors, suggesting that genetics may be an additional contributor. The aim of this cross-sectional study was to investigate genetic variants associated with linear growth in pediatric-onset CD. We genotyped 951 subjects (317 CD patient-parent trios) for 64 polymorphisms within 14 CD-susceptibility and 23 stature-associated loci. Patient height-for-age Z-score < -1.64 was used to dichotomize probands into growth-impaired and nongrowth-impaired groups. The transmission disequilibrium test (TDT) was used to study association to growth impairment. There was a significant association between growth impairment in CD (height-for-age Z-score < -1.64) and a stature-related polymorphism in the dymeclin gene DYM (rs8099594) (OR = 3.2, CI [1.57-6.51], p = 0.0007). In addition, there was nominal over-transmission of two CD-susceptibility alleles, 10q21.1 intergenic region (rs10761659) and ATG16L1 (rs10210302), in growth-impaired CD children (OR = 2.36, CI [1.26-4.41], p = 0.0056 and OR = 2.45, CI [1.22-4.95], p = 0.0094, respectively). Our data indicate that genetic influences due to stature-associated and possibly CD risk alleles may predispose CD patients to alterations in linear growth. This is the first report of a link between a stature-associated locus and growth impairment in CD. (© 2010 The Authors. Annals of Human Genetics © 2010 Blackwell Publishing Ltd/University College London.)
- Published
- 2010
11. Mapping transcription mechanisms from multimodal genomic data.
- Author
-
Chang HH, McGeachie M, Alterovitz G, and Ramoni MF
- Subjects
- Gene Expression, Gene Expression Profiling, Genetic Variation, Humans, Leukemia genetics, Polymorphism, Single Nucleotide, Quantitative Trait Loci, Computational Biology methods, Genome, Transcription, Genetic genetics
- Abstract
Background: Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the tremendous size of data. Results: We develop a method to identify eQTLs. The method represents eQTLs as information flux between genetic variants and transcripts. We use information theory to simultaneously interrogate SNP and gene expression data, resulting in a Transcriptional Information Map (TIM) which captures the network of transcriptional information that links genetic variations, gene expression and regulatory mechanisms. These maps are able to identify both cis- and trans-regulating eQTLs. The application on a dataset of leukemia patients identifies eQTLs in the regions of the GART, PCP4, DSCAM, and RIPK4 genes that regulate ADAMTS1, a known leukemia correlate. Conclusions: The information theory approach presented in this paper is able to infer the dependence networks between SNPs and transcripts, which in turn can identify cis- and trans-eQTLs. The application of our method to the leukemia study explains how genetic variants and gene expression are linked to leukemia.
- Published
- 2010
12. A systems biology approach to modeling vibrio cholerae gene expression under virulence-inducing conditions.
- Author
-
Kanjilal S, Citorik R, LaRocque RC, Ramoni MF, and Calderwood SB
- Subjects
- Bacterial Proteins genetics, Bacteriological Techniques, Cholera Toxin genetics, Cholera Toxin metabolism, Cluster Analysis, Culture Media, Oligonucleotide Array Sequence Analysis, Time Factors, Vibrio cholerae genetics, Vibrio cholerae metabolism, Virulence, Bacterial Proteins metabolism, Gene Expression Profiling, Gene Expression Regulation, Bacterial, Systems Biology methods, Vibrio cholerae growth & development, Vibrio cholerae pathogenicity
- Abstract
Vibrio cholerae is a Gram-negative bacillus that is the causative agent of cholera. Pathogenesis in vivo occurs through a series of spatiotemporally controlled events under the control of a gene cascade termed the ToxR regulon. Major genes in the ToxR regulon include the master regulators toxRS and tcpPH, the downstream regulator toxT, and virulence factors, the ctxAB and tcpA operons. Our current understanding of the dynamics of virulence gene expression is limited to microarray analyses of expression at selected time points. To better understand this process, we utilized a systems biology approach to examine the temporal regulation of gene expression in El Tor V. cholerae grown under virulence-inducing conditions in vitro (AKI medium), using high-resolution time series genomic profiling. Results showed that overall gene expression in AKI medium mimics that of in vivo studies but with less clear temporal separation between upstream regulators and downstream targets. Expression of toxRS was unaffected by growth under virulence-inducing conditions, but expression of toxT was activated shortly after switching from stationary to aerating conditions. The tcpA operon was also activated early during mid-exponential-phase growth, while the ctxAB operon was turned on later, after the rise in toxT expression. Expression of ctxAB continued to rise despite an eventual decrease in toxT. Cluster analysis of gene expression highlighted 15 hypothetical genes and six genes related to environmental information processing that represent potential new members of the ToxR regulon. This study applies systems biology tools to analysis of gene expression of V. cholerae in vitro and provides an important comparator for future studies done in vivo.
- Published
- 2010
13. Ontology engineering.
- Author
-
Alterovitz G, Xiang M, Hill DP, Lomax J, Liu J, Cherkassky M, Dreyfuss J, Mungall C, Harris MA, Dolan ME, Blake JA, and Ramoni MF
- Subjects
- Computational Biology methods, Data Mining methods, Genes genetics, Genetic Engineering statistics & numerical data, Proteins classification, Proteins genetics, Terminology as Topic
- Published
- 2010
14. The challenges of informatics in synthetic biology: from biomolecular networks to artificial organisms.
- Author
-
Alterovitz G, Muso T, and Ramoni MF
- Subjects
- Algorithms, Base Sequence, DNA, Molecular Sequence Data, Sequence Homology, Nucleic Acid, Software, Systems Biology
- Abstract
The field of synthetic biology holds an inspiring vision for the future; it integrates computational analysis, biological data and the systems engineering paradigm in the design of new biological machines and systems. These biological machines are built from basic biomolecular components analogous to electrical devices, and the information flow among these components requires the augmentation of biological insight with the power of a formal approach to information management. Here we review the informatics challenges in synthetic biology along three dimensions: in silico, in vitro and in vivo. First, we describe state of the art of the in silico support of synthetic biology, from the specific data exchange formats, to the most popular software platforms and algorithms. Next, we cast in vitro synthetic biology in terms of information flow, and discuss genetic fidelity in DNA manipulation, development strategies of biological parts and the regulation of biomolecular networks. Finally, we explore how the engineering chassis can manipulate biological circuitries in vivo to give rise to future artificial organisms.
- Published
- 2010
15. Dynamic gene expression analysis links melanocyte growth arrest with nevogenesis.
- Author
-
Yang G, Thieu K, Tsai KY, Piris A, Udayakumar D, Njauw CN, Ramoni MF, and Tsao H
- Subjects
- Bayes Theorem, Cell Growth Processes genetics, Cell Transformation, Neoplastic pathology, Cluster Analysis, Gene Expression Profiling methods, Humans, Melanocytes cytology, Melanoma pathology, Nevus pathology, Cell Transformation, Neoplastic genetics, Melanocytes physiology, Melanoma genetics, Nevus genetics
- Abstract
Like all primary cells in vitro, normal human melanocytes exhibit a physiologic decay in proliferative potential as they transition to a growth-arrested state. The underlying transcriptional program(s) that regulate this phenotypic change are largely unknown. To identify molecular determinants of this process, we performed a Bayesian-based dynamic gene expression analysis on primary melanocytes undergoing proliferative arrest. This analysis revealed several related clusters whose expression behavior correlated with the melanocyte growth kinetics; we designated these clusters the melanocyte growth arrest program (MGAP). These MGAP genes were preferentially represented in benign melanocytic nevi over melanomas and selectively mapped to the hepatocyte fibrosis pathway. This transcriptional relationship between melanocyte growth stasis, nevus biology, and fibrogenic signaling was further validated in vivo by the demonstration of strong pericellular collagen deposition within benign nevi but not melanomas. Taken together, our study provides a novel view of fibroplasia in both melanocyte biology and nevogenesis.
- Published
- 2009
16. Transcriptional network classifiers.
- Author
-
Chang HH and Ramoni MF
- Subjects
- Aortic Aneurysm, Thoracic diagnosis, Aortic Aneurysm, Thoracic genetics, Bayes Theorem, Humans, Lung Neoplasms diagnosis, Lung Neoplasms genetics, Systems Biology, Gene Regulatory Networks, Transcription, Genetic
- Abstract
Background: Gene interactions play a central role in transcriptional networks. Many studies have performed genome-wide expression analysis to reconstruct regulatory networks to investigate disease processes. Since biological processes are outcomes of regulatory gene interactions, this paper develops a system biology approach to infer function-dependent transcriptional networks modulating phenotypic traits, which serve as a classifier to identify tissue states. Because gene interactions are taken into account in the analysis, we can achieve higher classification accuracy than existing methods. Results: Our system biology approach is carried out within the Bayesian networks framework. The algorithm consists of two steps: gene filtering by Bayes factor followed by collinearity elimination via network learning. We validate our approach with two clinical datasets. In the study of lung cancer subtype discrimination, we obtain a 25-gene classifier from 111 training samples, and the test on 422 independent samples achieves 95% classification accuracy. In the study of thoracic aortic aneurysm (TAA) diagnosis, 61 samples determine a 34-gene classifier, whose diagnostic accuracy on 33 independent samples reaches 82%. The performance comparisons with three other popular methods, PCA/LDA, PAM, and Weighted Voting, confirm that our approach yields superior classification accuracy and a more compact signature. Conclusions: The system biology approach presented in this paper is able to infer function-dependent transcriptional networks, which in turn can classify biological samples with high accuracy. The validation of our classifier using clinical data demonstrates the promising value of our proposed approach for disease diagnosis.
- Published
- 2009
17. Predicting response to short-acting bronchodilator medication using Bayesian networks.
- Author
-
Himes BE, Wu AC, Duan QL, Klanderman B, Litonjua AA, Tantisira K, Ramoni MF, and Weiss ST
- Subjects
- Asthma physiopathology, Bayes Theorem, Child, Data Interpretation, Statistical, Female, Genetic Variation, Genotype, Humans, Logistic Models, Male, Neural Networks, Computer, Pharmacogenetics, Polymorphism, Single Nucleotide, Predictive Value of Tests, Reproducibility of Results, Respiratory Function Tests, Asthma drug therapy, Asthma genetics, Bronchodilator Agents therapeutic use
- Abstract
Aims: Bronchodilator response tests measure the effect of beta(2)-agonists, the most commonly used short-acting reliever drugs for asthma. We sought to relate candidate gene SNP data with bronchodilator response and measure the predictive accuracy of a model constructed with genetic variants. Materials & Methods: Bayesian networks, multivariate models that are able to account for simultaneous associations and interactions among variables, were used to create a predictive model of bronchodilator response using candidate gene SNP data from 308 Childhood Asthma Management Program Caucasian subjects. Results: The model found that 15 SNPs in 15 genes predict bronchodilator response with fair accuracy, as established by a fivefold cross-validation area under the receiver-operating characteristic curve of 0.75 (standard error: 0.03). Conclusion: Bayesian networks are an attractive approach to analyze large-scale pharmacogenetic SNP data because of their ability to automatically learn complex models that can be used for the prediction and discovery of novel biological hypotheses.
- Published
- 2009
18. Prediction of chronic obstructive pulmonary disease (COPD) in asthma patients using electronic medical records.
- Author
-
Himes BE, Dai Y, Kohane IS, Weiss ST, and Ramoni MF
- Subjects
- Aged, Comorbidity, Disease Progression, Female, Humans, Male, Middle Aged, Models, Biological, Multivariate Analysis, Natural Language Processing, ROC Curve, Risk Factors, Asthma complications, Bayes Theorem, Medical Records Systems, Computerized, Neural Networks, Computer, Pulmonary Disease, Chronic Obstructive etiology
- Abstract
Objective: Identify clinical factors that modulate the risk of progression to COPD among asthma patients using data extracted from electronic medical records. Design: Demographic information and comorbidities from adult asthma patients who were observed for at least 5 years, with initial observation dates between 1988 and 1998, were extracted from electronic medical records of the Partners Healthcare System using tools of the National Center for Biomedical Computing "Informatics for Integrating Biology to the Bedside" (i2b2). Measurements: A predictive model of COPD was constructed from a set of 9,349 patients (843 cases, 8,506 controls) using Bayesian networks. The model's predictive accuracy was tested by using it to predict COPD in a future independent set of asthma patients (992 patients; 46 cases, 946 controls), who had initial observation dates between 1999 and 2002. Results: A Bayesian network model composed of age, sex, race, smoking history, and 8 comorbidity variables is able to predict COPD in the independent set of patients with an accuracy of 83.3%, computed as the area under the Receiver Operating Characteristic curve (AUROC). Conclusions: Our results demonstrate that data extracted from electronic medical records can be used to create predictive models. With improvements in data extraction and inclusion of more variables, such models may prove to be clinically useful.
- Published
- 2009
19. Use of Bayesian networks to probabilistically model and improve the likelihood of validation of microarray findings by RT-PCR.
- Author
-
English SB, Shih SC, Ramoni MF, Smith LE, and Butte AJ
- Subjects
- Analysis of Variance, Animals, Diabetic Retinopathy genetics, Diabetic Retinopathy metabolism, Genomics, Hyperoxia genetics, Hyperoxia metabolism, Mice, Models, Statistical, RNA, Messenger analysis, Reproducibility of Results, Bayes Theorem, Gene Expression Profiling methods, Oligonucleotide Array Sequence Analysis, Reverse Transcriptase Polymerase Chain Reaction
- Abstract
Though genome-wide technologies, such as microarrays, are widely used, data from these methods are considered noisy; there is still varied success in downstream biological validation. We report a method that increases the likelihood of successfully validating microarray findings using real time RT-PCR, including genes at low expression levels and with small differences. We use a Bayesian network to identify the most relevant sources of noise based on the successes and failures in validation for an initial set of selected genes, and then improve our subsequent selection of genes for validation based on eliminating these sources of noise. The network displays the significant sources of noise in an experiment, and scores the likelihood of validation for every gene. We show how the method can significantly increase validation success rates. In conclusion, in this study, we have successfully added a new automated step to determine the contributory sources of noise that determine successful or unsuccessful downstream biological validation.
- Published
- 2009
20. A testable prognostic model of nicotine dependence.
- Author
-
Ramoni RB, Saccone NL, Hatsukami DK, Bierut LJ, and Ramoni MF
- Subjects
- Animals, Bayes Theorem, Genetic Association Studies, Humans, Prognosis, Tobacco Use Disorder genetics, Tobacco Use Disorder physiopathology, Models, Biological, Tobacco Use Disorder diagnosis
- Abstract
Individuals' dependence on nicotine, primarily through cigarette smoking, is a major source of morbidity and mortality worldwide. Many smokers attempt but fail to quit smoking, motivating researchers to identify the origins of this dependence. Because of the known heritability of nicotine-dependence phenotypes, considerable interest has been focused on discovering the genetic factors underpinning the trait. This goal, however, is not easily attained: no single factor is likely to explain any great proportion of dependence because nicotine dependence is thought to be a complex trait (i.e., the result of many interacting factors). Genomewide association studies are powerful tools in the search for the genomic bases of complex traits, and in this context, novel candidate genes have been identified through single nucleotide polymorphism (SNP) association analyses. Beyond association, however, genetic data can be used to generate predictive models of nicotine dependence. As expected in the context of a complex trait, individual SNPs fail to accurately predict nicotine dependence, demanding the use of multivariate models. Standard approaches, such as logistic regression, are unable to consider large numbers of SNPs given existing sample sizes. However, using Bayesian networks, one can overcome these limitations to generate a multivariate predictive model, which has markedly enhanced predictive accuracy on fitted values relative to that of individual SNPs. This approach, combined with the data being generated by genomewide association studies, promises to shed new light on the common, complex trait nicotine dependence.
- Published
- 2009
21. Characterization of patients who suffer asthma exacerbations using data extracted from electronic medical records.
- Author
-
Himes BE, Kohane IS, Ramoni MF, and Weiss ST
- Subjects
- Boston epidemiology, Chronic Disease, Humans, Incidence, Risk Assessment methods, Risk Factors, Artificial Intelligence, Asthma diagnosis, Asthma epidemiology, Medical Records Systems, Computerized statistics & numerical data, Natural Language Processing, Pattern Recognition, Automated methods
- Abstract
The increasing availability of electronic medical records offers opportunities to better characterize patient populations and create predictive tools to individualize health care. We determined which asthma patients suffer exacerbations using data extracted from electronic medical records of the Partners Healthcare System using Natural Language Processing tools from the "Informatics for Integrating Biology to the Bedside" center (i2b2). Univariable and multivariable analysis of data for 11,356 patients (1,394 cases, 9,962 controls) found that race, BMI, smoking history, and age at initial observation are predictors of asthma exacerbations. The area under the receiver operating characteristic curve (AUROC) corresponding to prediction of exacerbations in an independent group of 1,436 asthma patients (106 cases, 1,330 controls) is 0.67. Our findings are consistent with previous characterizations of asthma patients in epidemiological studies, and demonstrate that data extracted by natural language processing from electronic medical records is suitable for the characterization of patient populations.
- Published
- 2008
22. Automated programming for bioinformatics algorithm deployment.
- Author
-
Alterovitz G, Jiwaji A, and Ramoni MF
- Subjects
- Systems Integration, Algorithms, Computational Biology methods, Computer Graphics, Programming Languages, Software, User-Computer Interface
- Abstract
Many bioinformatics solutions suffer from the lack of a usable interface/platform from which results can be analyzed and visualized. Overcoming this hurdle would allow for more widespread dissemination of bioinformatics algorithms within the biological and medical communities. The algorithms should be accessible without extensive technical support or programming knowledge. Here, we propose a dynamic wizard platform that provides users with a Graphical User Interface (GUI) for most Java bioinformatics library toolkits. The application interface is generated in real time based on the original source code. This platform lets developers focus on designing algorithms and biologists/physicians on testing hypotheses and analyzing results. Availability: The open source code can be downloaded from: http://bcl.med.harvard.edu/proteomics/proj/APBA/.
- Published
- 2008
- Full Text
- View/download PDF
23. System-wide peripheral biomarker discovery using information theory.
- Author
-
Alterovitz G, Xiang M, Liu J, Chang A, and Ramoni MF
- Subjects
- Body Fluids chemistry, Computational Biology, Female, Humans, Male, Models, Statistical, Pregnancy, Proteomics statistics & numerical data, Tissue Distribution, Biomarkers analysis, Information Theory
- Abstract
The identification of reliable peripheral biomarkers for clinical diagnosis, patient prognosis, and biological functional studies would allow for access to biological information currently available only through invasive methods. Traditional approaches have so far considered aspects of tissues and biofluid markers independently. Here we introduce an information-theoretic framework for biomarker discovery, integrating biofluid and tissue information. This allows us to identify tissue information in peripheral biofluids. We treat tissue-biofluid interactions as an information channel through functional space using 26 proteomes from 45 different sources to determine quantitatively the correspondence of each biofluid for specific tissues via relative entropy calculation of proteomes mapped onto phenotype, function, and drug space. Next, we identify candidate biofluids and biomarkers responsible for functional information transfer (p < 0.01). A total of 851 unique candidate biomarker proxies were identified. The biomarkers were found to be significant functional tissue proxies compared to random proteins (p < 0.001). This proxy link is found to be further enhanced by filtering the biofluid proteins to include only significant tissue-biofluid information channels and is further validated by gene expression. Furthermore, many of the candidate biomarkers are novel and have yet to be explored. In addition to characterizing proteins and their interactions with a systemic perspective, our work can be used as a roadmap to guide biomedical investigation, from suggesting biofluids for study to constraining the search for biomarkers. This work has applications in disease screening, diagnosis, and protein function studies.
- Published
- 2008
24. Economic evaluation of a Bayesian model to predict late-phase success of new chemical entities.
- Author
-
Schachter AD, Ramoni MF, Baio G, Roberts TG Jr, and Finkelstein SN
- Subjects
- Antineoplastic Agents pharmacology, Clinical Trials, Phase III as Topic, Drug Industry, Forecasting, Humans, Models, Biological, Sensitivity and Specificity, Antineoplastic Agents economics, Bayes Theorem, Economics, Pharmaceutical
- Abstract
Objective: To evaluate the economic impact of a Bayesian network model designed to predict clinical success of a new chemical entity (NCE) based on pre-phase III data. Methods: We trained our Bayesian network model on publicly accessible data on 503 NCEs, stratified by therapeutic class. We evaluated the sensitivity, specificity and accuracy of our model on an independent data set of 18 NCE-indication pairs, using prior probability data for the antineoplastic NCEs within the training set. We performed Monte Carlo simulations to evaluate the economic performance of our model relative to reported pharmaceutical industry performance, taking into account reported capitalized phase costs, cumulative revenues for a postapproval period of 7 years, and the range of possible false negative and true negative rates for terminated NCEs within the pharmaceutical industry. Results: Our model predicted outcomes on the independent validation set of oncology agents with 78% accuracy (80% sensitivity and 76% specificity). In comparison with the pharmaceutical industry's reported success rates, on average our model significantly reduced capitalized expenditures from $727 million/successful NCE to $444 million/successful NCE (P < 0.001), and significantly improved revenues from $347 million/phase III trial to $507 million/phase III trial (P < 0.001) during the first 7 years post launch. These results indicate that our model identified successful NCEs more efficiently than currently reported pharmaceutical industry performances. Conclusions: Accurate prediction of NCE outcomes is computationally feasible, significantly increasing the proportion of successful NCEs, and likely eliminating ineffective and unsafe NCEs.
- Published
- 2007
- Full Text
- View/download PDF
25. Bayesian methods for proteomics.
- Author
-
Alterovitz G, Liu J, Afkhami E, and Ramoni MF
- Subjects
- Models, Molecular, Peptides chemistry, Phylogeny, Signal Transduction, Bayes Theorem, Proteomics
- Abstract
Biological and medical data have been growing exponentially over the past several years [1, 2]. In particular, proteomics has seen automation dramatically change the rate at which data are generated [3]. Analysis that systemically incorporates prior information is becoming essential to making inferences about the myriad, complex data [4-6]. A Bayesian approach can help capture such information and incorporate it seamlessly through a rigorous, probabilistic framework. This paper starts with a review of the background mathematics behind the Bayesian methodology: from parameter estimation to Bayesian networks. The article then goes on to discuss how emerging Bayesian approaches have already been successfully applied to research across proteomics, a field for which Bayesian methods are particularly well suited [7-9]. After reviewing the literature on the subject of Bayesian methods in biological contexts, the article discusses some of the recent applications in proteomics and emerging directions in the field.
- Published
- 2007
- Full Text
- View/download PDF
26. Transcription factor CHF1/Hey2 regulates the global transcriptional response to platelet-derived growth factor in vascular smooth muscle cells.
- Author
-
Shirvani SM, Mookanamparambil L, Ramoni MF, and Chin MT
- Subjects
- Animals, Cells, Cultured, Mice, Mice, Knockout, Muscle, Smooth, Vascular cytology, Muscle, Smooth, Vascular drug effects, Muscle, Smooth, Vascular metabolism, Mutation, Myocytes, Smooth Muscle cytology, Myocytes, Smooth Muscle metabolism, Oligonucleotide Array Sequence Analysis, Reverse Transcriptase Polymerase Chain Reaction, Basic Helix-Loop-Helix Transcription Factors genetics, Myocytes, Smooth Muscle drug effects, Platelet-Derived Growth Factor pharmacology, Repressor Proteins genetics, Transcription, Genetic drug effects
- Abstract
The cardiovascular restricted transcription factor CHF1/Hey2 has been previously shown to regulate the smooth muscle response to growth factors. To determine how CHF1/Hey2 affects the smooth muscle response to growth factors, we performed a genomic screen for transcripts that are differentially expressed in wild-type and knockout smooth muscle cells after stimulation with platelet-derived growth factor. We screened 45,101 probes representing >39,000 transcripts derived from at least 34,000 genes, at eight different time points. We analyzed the expression data utilizing an algorithm based on Bayesian statistics to derive the best polynomial clustering model to fit the expression data. We found that in a total of 9,827 transcripts the normalized ratio of knockout to wild-type expression diverged more than threefold from baseline in at least one time point, and these transcripts separated into 17 distinct clusters. Further analysis of each cluster revealed distinct alterations in gene expression patterns for immediate early genes, transcription factors, matrix metalloproteinases, signaling molecules, and other molecules important in vascular biology. Our findings demonstrate that CHF1/Hey2 profoundly affects vascular smooth muscle phenotype by altering both the absolute expression level of a variety of genes and the kinetics of growth factor-induced gene expression.
- Published
- 2007
- Full Text
- View/download PDF
27. Paediatric drug development.
- Author
-
Schachter AD and Ramoni MF
- Subjects
- Child, Clinical Trials as Topic, Drug Approval, Drug Industry, Drug Labeling, Humans, Legislation, Drug, Drug Design, Pediatrics
- Published
- 2007
- Full Text
- View/download PDF
28. Bayesian approaches to reverse engineer cellular systems: a simulation study on nonlinear Gaussian networks.
- Author
-
Ferrazzi F, Sebastiani P, Ramoni MF, and Bellazzi R
- Subjects
- Computer Simulation, Normal Distribution, Saccharomycetales cytology, Bayes Theorem, Cell Cycle physiology, Models, Biological, Nonlinear Dynamics, Saccharomycetales physiology
- Abstract
Background: Reverse engineering cellular networks is currently one of the most challenging problems in systems biology. Dynamic Bayesian networks (DBNs) seem to be particularly suitable for inferring relationships between cellular variables from the analysis of time series measurements of mRNA or protein concentrations. As evaluating inference results on a real dataset is controversial, the use of simulated data has been proposed. However, DBN approaches that use continuous variables, thus avoiding the information loss associated with discretization, have not yet been extensively assessed, and most of the proposed approaches have dealt with linear Gaussian models. Results: We propose a generalization of dynamic Gaussian networks to accommodate nonlinear dependencies between variables. As a benchmark dataset to test the new approach, we used data from a mathematical model of cell cycle control in budding yeast that realistically reproduces the complexity of a cellular system. We evaluated the ability of the networks to describe the dynamics of cellular systems and their precision in reconstructing the true underlying causal relationships between variables. We also tested the robustness of the results by analyzing the effect of noise on the data, and the impact of a different sampling time. Conclusion: The results confirmed that DBNs with Gaussian models can be effectively exploited for a first level analysis of data from complex cellular systems. The inferred models are parsimonious and have a satisfying goodness of fit. Furthermore, the networks not only offer a phenomenological description of the dynamics of cellular systems, but are also able to suggest hypotheses concerning the causal interactions between variables. The proposed nonlinear generalization of Gaussian models yielded models characterized by a slightly lower goodness of fit than the linear model, but a better ability to recover the true underlying connections between variables.
- Published
- 2007
- Full Text
- View/download PDF
29. Geography and genography: prediction of continental origin using randomly selected single nucleotide polymorphisms.
- Author
-
Allocco DJ, Song Q, Gibbons GH, Ramoni MF, and Kohane IS
- Subjects
- Analysis of Variance, Asian People genetics, Bayes Theorem, Black People genetics, Databases, Genetic, Humans, Models, Genetic, White People genetics, Genetic Variation, Genetics, Population, Genome, Human genetics, Geography, Polymorphism, Single Nucleotide genetics
- Abstract
Background: Recent studies have shown that when individuals are grouped on the basis of genetic similarity, group membership corresponds closely to continental origin. There has been considerable debate about the implications of these findings in the context of larger debates about race and the extent of genetic variation between groups. Some have argued that clustering according to continental origin demonstrates the existence of significant genetic differences between groups and that these differences may have important implications for differences in health and disease. Others argue that clustering according to continental origin requires the use of large amounts of genetic data or specifically chosen markers and is indicative only of very subtle genetic differences that are unlikely to have biomedical significance. Results: We used small numbers of randomly selected single nucleotide polymorphisms (SNPs) from the International HapMap Project to train naïve Bayes classifiers for prediction of ancestral continent of origin. Predictive accuracy was tested on two independent data sets. Genetically similar groups should be difficult to distinguish, especially if only a small number of genetic markers are used. The genetic differences between continentally defined groups are sufficiently large that one can accurately predict ancestral continent of origin using only a minute, randomly selected fraction of the genetic variation present in the human genome. Genotype data from only 50 random SNPs was sufficient to predict ancestral continent of origin in our primary test data set with an average accuracy of 95%. Genetic variations informative about ancestry were common and widely distributed throughout the genome. Conclusion: Accurate characterization of ancestry is possible using small numbers of randomly selected SNPs. The results presented here show how investigators conducting genetic association studies can use small numbers of arbitrarily chosen SNPs to identify stratification in study subjects and avoid false positive genotype-phenotype associations. Our findings also demonstrate the extent of variation between continentally defined groups and argue strongly against the contention that genetic differences between groups are too small to have biomedical significance.
- Published
- 2007
- Full Text
- View/download PDF
30. A novel, single nucleotide polymorphism-based assay to detect 22q11 deletions.
- Author
-
Funke BH, Brown AC, Ramoni MF, Regan ME, Baglieri C, Finn CT, Babcock M, Shprintzen RJ, Morrow BE, and Kucherlapati R
- Subjects
- Base Sequence, Bayes Theorem, DNA, Humans, In Situ Hybridization, Fluorescence, Molecular Sequence Data, Sensitivity and Specificity, Chromosome Deletion, Chromosomes, Human, Pair 22, Polymorphism, Single Nucleotide
- Abstract
Velocardiofacial syndrome, DiGeorge syndrome, and conotruncal anomaly face syndrome, now collectively referred to as 22q11 deletion syndrome (22q11DS), are caused by microdeletions on chromosome 22q11. The great majority (approximately 90%) of these deletions are 3 Mb in size. The remaining deleted patients have nested break-points resulting in overlapping regions of hemizygosity. Diagnostic testing for the disorder is traditionally done by fluorescent in situ hybridization (FISH) using probes located in the proximal half of the region common to all deletions. We developed a novel, high-resolution single-nucleotide polymorphism (SNP) genotyping assay to detect 22q11 deletions. We validated this assay using DNA from 110 nondeleted controls and 77 patients with 22q11DS that had previously been tested by FISH. The assay was 100% sensitive (all deletions were correctly identified). Our assay was also able to detect a case of segmental uniparental disomy at 22q11 that was not detected by the FISH assay. We used Bayesian networks to identify a set of 17 SNPs that are sufficient to ascertain unambiguously the deletion status of 22q11DS patients. Our SNP-based assay is a highly accurate, sensitive, and specific method for the diagnosis of 22q11 deletion syndrome.
- Published
- 2007
- Full Text
- View/download PDF
31. Clinical forecasting in drug development.
- Author
-
Schachter AD and Ramoni MF
- Subjects
- Bayes Theorem, Drug Industry economics, Humans, Models, Theoretical, Clinical Trials as Topic, Forecasting, Pharmaceutical Preparations economics
- Published
- 2007
- Full Text
- View/download PDF
32. Serum proteome profiling detects myelodysplastic syndromes and identifies CXC chemokine ligands 4 and 7 as markers for advanced disease.
- Author
-
Aivado M, Spentzos D, Germing U, Alterovitz G, Meng XY, Grall F, Giagounidis AA, Klement G, Steidl U, Otu HH, Czibere A, Prall WC, Iking-Konert C, Shayne M, Ramoni MF, Gattermann N, Haas R, Mitsiades CS, Fung ET, and Libermann TA
- Subjects
- Humans, Mass Spectrometry, Biomarkers metabolism, Blood Proteins chemistry, Chemokines, CXC metabolism, Myelodysplastic Syndromes blood, Proteome
- Abstract
Myelodysplastic syndromes (MDS) are among the most frequent hematologic malignancies. Patients have a short survival and often progress to acute myeloid leukemia. The diagnosis of MDS can be difficult; there is a paucity of molecular markers, and the pathophysiology is largely unknown. Therefore, we conducted a multicenter study investigating whether serum proteome profiling may serve as a noninvasive platform to discover novel molecular markers for MDS. We generated serum proteome profiles from 218 individuals by MS and identified a profile that distinguishes MDS from non-MDS cytopenias in a learning sample set. This profile was validated by testing its ability to predict MDS in a first independent validation set and a second, prospectively collected, independent validation set run 5 months apart. Accuracy was 80.5% in the first and 79.0% in the second validation set. Peptide mass fingerprinting and quadrupole TOF MS identified two differential proteins: CXC chemokine ligands 4 (CXCL4) and 7 (CXCL7), both of which had significantly decreased serum levels in MDS, as confirmed with independent antibody assays. Western blot analyses of platelet lysates for these two platelet-derived molecules revealed a lack of CXCL4 and CXCL7 in MDS. Subtype analyses revealed that these two proteins have decreased serum levels in advanced MDS, suggesting the possibility of a concerted disturbance of transcription or translation of these chemokines in advanced MDS.
- Published
- 2007
- Full Text
- View/download PDF
33. GO PaD: the Gene Ontology Partition Database.
- Author
-
Alterovitz G, Xiang M, Mohan M, and Ramoni MF
- Subjects
- Computer Graphics, Humans, Internet, User-Computer Interface, Vocabulary, Controlled, Databases, Genetic, Genes physiology
- Abstract
Gene Ontology (GO) has been widely used to infer functional significance associated with sets of genes in order to automate discoveries within large-scale genetic studies. A level in GO's directed acyclic graph structure is often assumed to be indicative of its terms' specificities, although other work has suggested this assumption does not hold. Unfortunately, quantitative analysis of biological functions based on nodes at the same level (as is common in gene enrichment analysis tools) can lead to incorrect conclusions as well as missed discoveries due to inefficient use of available information. This paper addresses these issues using an information-theoretic approach encoded in the GO Partition Database that guarantees to maximize information for gene enrichment analysis. The GO Partition Database was designed to feature ontology partitions with GO terms of similar specificity. The GO partitions comprise varying numbers of nodes and present relevant information-theoretic statistics, so researchers can choose to analyze datasets at arbitrary levels of specificity. The GO Partition Database, featuring GO partition sets for functional analysis of genes from human and 10 other commonly studied organisms with a total of 131,972 genes, is available on the internet at: bcl.med.harvard.edu/proj/gopart. The site also includes an online tutorial.
- Published
- 2007
- Full Text
- View/download PDF
34. Regulation of myogenic progenitor proliferation in human fetal skeletal muscle by BMP4 and its antagonist Gremlin.
- Author
-
Frank NY, Kho AT, Schatton T, Murphy GF, Molloy MJ, Zhan Q, Ramoni MF, Frank MH, Kohane IS, and Gussoni E
- Subjects
- Bone Morphogenetic Protein 4, Bone Morphogenetic Protein Receptors, Type I metabolism, Bone Morphogenetic Proteins antagonists & inhibitors, Cell Proliferation, Cells, Cultured, Fetus, Humans, Muscle Fibers, Skeletal metabolism, Myoblasts, Skeletal metabolism, Bone Morphogenetic Proteins physiology, Intercellular Signaling Peptides and Proteins physiology, Muscle Fibers, Skeletal cytology, Myoblasts, Skeletal cytology
- Abstract
Skeletal muscle side population (SP) cells are thought to be "stem"-like cells. Despite reports confirming the ability of muscle SP cells to give rise to differentiated progeny in vitro and in vivo, the molecular mechanisms defining their phenotype remain unclear. In this study, gene expression analyses of human fetal skeletal muscle demonstrate that bone morphogenetic protein 4 (BMP4) is highly expressed in SP cells but not in main population (MP) mononuclear muscle-derived cells. Functional studies revealed that BMP4 specifically induces proliferation of BMP receptor 1a-positive MP cells but has no effect on SP cells, which are BMPR1a-negative. In contrast, the BMP4 antagonist Gremlin, specifically up-regulated in MP cells, counteracts the stimulatory effects of BMP4 and inhibits proliferation of BMPR1a-positive muscle cells. In vivo, BMP4-positive cells can be found in the proximity of BMPR1a-positive cells in the interstitial spaces between myofibers. Gremlin is expressed by mature myofibers and interstitial cells, which are separate from BMP4-expressing cells. Together, these studies propose that BMP4 and Gremlin, which are highly expressed by human fetal skeletal muscle SP and MP cells, respectively, are regulators of myogenic progenitor proliferation.
- Published
- 2006
- Full Text
- View/download PDF
35. Melanoma cell adhesion molecule is a novel marker for human fetal myogenic cells and affects myoblast fusion.
- Author
-
Cerletti M, Molloy MJ, Tomczak KK, Yoon S, Ramoni MF, Kho AT, Beggs AH, and Gussoni E
- Subjects
- Adult, Animals, CD146 Antigen genetics, Cell Fractionation, Cells, Cultured, Endothelial Cells metabolism, Female, Gene Expression Profiling, Gestational Age, Humans, Myoblasts cytology, Oligonucleotide Array Sequence Analysis, Pregnancy, RNA Interference, Biomarkers metabolism, CD146 Antigen metabolism, Cell Fusion, Fetus anatomy & histology, Muscle, Skeletal cytology, Muscle, Skeletal embryology, Muscle, Skeletal physiology, Myoblasts physiology
- Abstract
Myoblast fusion is a highly regulated process that is important during muscle development and myofiber repair and is also likely to play a key role in the incorporation of donor cells in myofibers for cell-based therapy. Although several proteins involved in muscle cell fusion in Drosophila are known, less information is available on the regulation of this process in vertebrates, including humans. To identify proteins that are regulated during fusion of human myoblasts, microarray studies were performed on samples obtained from human fetal skeletal muscle of seven individuals. Primary muscle cells were isolated, expanded, induced to fuse in vitro, and gene expression comparisons were performed between myoblasts and early or late myotubes. Among the regulated genes, melanoma cell adhesion molecule (M-CAM) was found to be significantly downregulated during human fetal muscle cell fusion. M-CAM expression was confirmed on activated myoblasts, both in vitro and in vivo, and on myoendothelial cells (M-CAM(+) CD31(+)), which were positive for the myogenic markers desmin and MyoD. Lastly, in vitro functional studies using M-CAM RNA knockdown demonstrated that inhibition of M-CAM expression enhances myoblast fusion. These studies identify M-CAM as a novel marker for myogenic progenitors in human fetal muscle and confirm that downregulation of this protein promotes myoblast fusion.
- Published
- 2006
- Full Text
- View/download PDF
36. Automation, parallelism, and robotics for proteomics.
- Author
-
Alterovitz G, Liu J, Chow J, and Ramoni MF
- Subjects
- Animals, Automation, Electrophoresis, Agar Gel, Mass Spectrometry, Mice, Protein Array Analysis, Proteomics instrumentation, Robotics instrumentation, Proteins chemistry, Proteomics methods, Robotics methods
- Abstract
The speed of the human genome project (Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C. et al., Nature 2001, 409, 860-921) was made possible, in part, by developments in automation of sequencing technologies. Before these technologies, sequencing was a laborious, expensive, and personnel-intensive task. Similarly, automation and robotics are changing the field of proteomics today. Proteomics is defined as the effort to understand and characterize proteins in the categories of structure, function and interaction (Englbrecht, C. C., Facius, A., Comb. Chem. High Throughput Screen. 2005, 8, 705-715). As such, this field nicely lends itself to automation technologies since these methods often require large economies of scale in order to achieve cost and time-saving benefits. This article describes some of the technologies and methods being applied in proteomics in order to facilitate automation within the field as well as in linking proteomics-based information with other related research areas.
- Published
- 2006
- Full Text
- View/download PDF
37. A Bayesian dynamic model for influenza surveillance.
- Author
-
Sebastiani P, Mandl KD, Szolovits P, Kohane IS, and Ramoni MF
- Subjects
- Child, Preschool, Disease Outbreaks, Humans, Infant, Influenza, Human prevention & control, Influenza, Human virology, United States, Bayes Theorem, Influenza A Virus, H5N1 Subtype growth & development, Influenza, Human epidemiology, Models, Biological, Models, Statistical, Population Surveillance methods
- Abstract
The severe acute respiratory syndrome (SARS) epidemic, the growing fear of an influenza pandemic and the recent shortage of flu vaccine highlight the need for surveillance systems able to provide early, quantitative predictions of epidemic events. We use dynamic Bayesian networks to discover the interplay among four data sources that are monitored for influenza surveillance. By integrating these different data sources into a dynamic model, we identify in children and infants presenting to the pediatric emergency department with respiratory syndromes an early indicator of impending influenza morbidity and mortality. Our findings show the importance of modelling the complex dynamics of data collected for influenza surveillance, and suggest that dynamic Bayesian networks could be suitable modelling tools for developing epidemic surveillance systems. (Copyright 2006 John Wiley & Sons, Ltd.)
- Published
- 2006
- Full Text
- View/download PDF
38. The gene expression profile in refractory periodontitis patients.
- Author
-
Kim DM, Ramoni MF, Nevins M, and Fiorellini JP
- Subjects
- Aged, Bayes Theorem, Down-Regulation genetics, Female, Humans, Male, Middle Aged, Mouth Mucosa immunology, Periodontitis immunology, RNA isolation & purification, Reverse Transcriptase Polymerase Chain Reaction methods, Up-Regulation genetics, Gene Expression genetics, Gene Expression Profiling methods, Periodontitis genetics
- Abstract
Background: There are no specific bacterial profiles or diagnostic tests capable of identifying refractory periodontitis patients before a treatment regimen is initiated. Therefore, in this high-risk cohort of patients who do not respond appropriately, host factors that might be partly under genetic control may play a crucial role in their susceptibility. Specifically, we tested the hypothesis that patients with refractory periodontitis have multiple upregulated and/or downregulated genes that might be important in influencing clinical risk. Methods: Oral subepithelial connective tissues were harvested aseptically from seven refractory periodontitis and seven periodontally well-maintained patients. An RNA isolation kit was used to isolate total RNA from tissue samples that had been stabilized in the RNA stabilizing reagent. The isolated total RNA was then subjected to gene expression profiling using the microarray to measure gene expression levels. The retrieved data were analyzed with a computer program for the differential analysis of gene expression microarray experiments. In addition, real-time polymerase chain reaction (PCR) analysis was performed on selected samples to confirm the microarray data's gene expression patterns. Results: A total of 68 upregulated and six downregulated genes were identified that were differentially expressed at least two-fold out of 22,283 genes we analyzed. The selected model provided a 93% intrinsic validation along with a 93% extrinsic validation. To validate the microarray data, five upregulated genes (lactotransferrin [LTF], matrix metalloproteinase-1 [MMP-1], MMP-3, interferon induced-15 [IFI-15], and Homo sapiens hypothetical protein MGC5566) and two downregulated genes (keratin 2A [KRT2A] and desmocollin-1 [DSC-1]) were randomly selected for further analysis by real-time PCR. The relative RNA expression level of these genes measured by real-time PCR was similar to those measured by microarrays. Conclusion: The combined use of microarray technology with the computer program for the differential analysis of gene expression microarray experiments provided a set of candidate genes that may serve as novel therapeutic intervention points and improved diagnostic and screening procedures for high-risk individuals.
- Published
- 2006
- Full Text
- View/download PDF
39. SELDI-TOF MS of quadruplicate urine and serum samples to evaluate changes related to storage conditions.
- Author
-
Traum AZ, Wells MP, Aivado M, Libermann TA, Ramoni MF, and Schachter AD
- Subjects
- Cryopreservation methods, Multicenter Studies as Topic, Time Factors, Albumins analysis, Blood Proteins analysis, Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization, Urine chemistry
- Abstract
Proteomic profiling with SELDI-TOF MS has facilitated the discovery of disease-specific protein profiles. However, multicenter studies are often hindered by the logistics required for prompt deep-freezing of samples in liquid nitrogen or dry ice within the clinic setting prior to shipping. We report high concordance between MS profiles within sets of quadruplicate split urine and serum samples deep-frozen at 0, 2, 6, and 24 h after sample collection. Gage R&R results confirm that deep-freezing times are not a statistically significant source of SELDI-TOF MS variability for either blood or urine.
- Published
- 2006
- Full Text
- View/download PDF
40. Discovering biological guilds through topological abstraction.
- Author
-
Alterovitz G and Ramoni MF
- Subjects
- Models, Biological, Escherichia coli genetics, Gene Expression Regulation, Bacterial, Gene Regulatory Networks
- Abstract
High-throughput generation of new types of relational biological datasets is creating a demand for methods to provide insights into their complexity. Such networks are often too large to interpret visually and too complicated to be explained solely based on local topological properties. One way to try to make sense of such complex networks would be to transform them into discernible abstracts, or summaries, of the original networks. Then, important components could become more readily visible. This work presents such an approach for understanding networks via abstraction of global network connectivity using compression. This made possible the discovery of a new type of topological class, referred to herein as a guild, that captures global connectivity similarity. Lastly, the correspondence of these guilds to biological function is validated via an E. coli gene regulation network. This resulted in biological findings that could not be derived from local topology of the original network.
- Published
- 2006
41. Factors affecting automated syndromic surveillance.
- Author
-
Wang L, Ramoni MF, Mandl KD, and Sebastiani P
- Subjects
- Automation, Epidemiologic Measurements, Humans, Models, Statistical, Population Surveillance, Reproducibility of Results, Disease Outbreaks, Syndrome
- Abstract
Objective: The increased threat of bioterroristic attacks and epidemic events requires the development of accurate and timely outbreak detection systems for early identification of anomalies in public health data. Material and Methods: We propose an automated outbreak detection system based on syndromic data. This system uses an autoregressive model with seasonal components to monitor, online, the daily counts of chief complaints for respiratory syndromes at the emergency department of two major metropolitan hospitals. We evaluate this system by estimating the false positive rate in real data under the assumption that there were no outbreaks of disease, and the true positive rate in real baseline data in which we injected stochastically simulated outbreaks of different shape and size. We then use directed graphical models to account for the effect of exogenous factors on the detection performance of the system. Results: Our study shows that for a week-long outbreak, our model has an overall 84.8% true detection accuracy across all shapes of outbreaks, while the outbreak size influences the earliness to detection. The false and true positive rates are also associated with the exogenous factors and knowledge about these factors can help to improve the detection accuracy. Conclusion: This study suggests that the integration of multiple data sources can significantly improve the detection accuracy of syndromic surveillance systems.
- Published
- 2005
- Full Text
- View/download PDF
42. Genetic dissection and prognostic modeling of overt stroke in sickle cell anemia.
- Author
-
Sebastiani P, Ramoni MF, Nolan V, Baldwin CT, and Steinberg MH
- Subjects
- Fetal Hemoglobin genetics, Genetic Markers, Genetic Predisposition to Disease, Genotype, Humans, Magnetic Resonance Imaging, Prognosis, Risk Factors, Signal Transduction, Anemia, Sickle Cell genetics, Fetal Hemoglobin metabolism, Hemoglobin, Sickle genetics, Models, Genetic, Polymorphism, Single Nucleotide, Stroke genetics
- Abstract
Sickle cell anemia (SCA) is a paradigmatic single gene disorder caused by homozygosity with respect to a unique mutation at the beta-globin locus. SCA is phenotypically complex, with different clinical courses ranging from early childhood mortality to a virtually unrecognized condition. Overt stroke is a severe complication affecting 6-8% of individuals with SCA. Modifier genes might interact to determine the susceptibility to stroke, but such genes have not yet been identified. Using Bayesian networks, we analyzed 108 SNPs in 39 candidate genes in 1,398 individuals with SCA. We found that 31 SNPs in 12 genes interact with fetal hemoglobin to modulate the risk of stroke. This network of interactions includes three genes in the TGF-beta pathway and SELP, which is associated with stroke in the general population. We validated this model in a different population by predicting the occurrence of stroke in 114 individuals with 98.2% accuracy.
- Published
- 2005
- Full Text
- View/download PDF
43. Optimization and evaluation of surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) with reversed-phase protein arrays for protein profiling.
- Author
-
Aivado M, Spentzos D, Alterovitz G, Otu HH, Grall F, Giagounidis AA, Wells M, Cho JY, Germing U, Czibere A, Prall WC, Porter C, Ramoni MF, and Libermann TA
- Subjects
- Lasers, Reproducibility of Results, Protein Array Analysis methods, Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization methods
- Abstract
Surface-enhanced laser desorption/ionization (SELDI) time-of-flight mass spectrometry with protein arrays has facilitated the discovery of disease-specific protein profiles in serum. Such results raise hopes that protein profiles may become a powerful diagnostic tool. To this end, reliable and reproducible protein profiles need to be generated from many samples, accurate mass peak heights are necessary, and the experimental variation of the profiles must be known. We adapted the entire processing of protein arrays to a robotics system, thus improving the intra-assay coefficients of variation (CVs) from 45.1% to 27.8% (p<0.001). In addition, we assessed up to 16 technical replicates, and demonstrated that analysis of 2-4 replicates significantly increases the reliability of the protein profiles. A recent report on limited long-term reproducibility seemed to concord with our initial inter-assay CVs, which varied widely and reached up to 56.7%. However, we discovered that the inter-assay CV is strongly dependent on the drying time before application of the matrix molecule. Therefore, we devised a standardized drying process and demonstrated that our optimized SELDI procedure generates reliable and long-term reproducible protein profiles with CVs ranging from 25.7% to 32.6%, depending on the signal-to-noise ratio threshold used.
- Published
- 2005
- Full Text
- View/download PDF
44. Robust transmission/disequilibrium test for incomplete family genotypes.
- Author
-
Sebastiani P, Abad MM, Alpargu G, and Ramoni MF
- Subjects
- Crohn Disease genetics, Data Interpretation, Statistical, Genetic Markers, Genotype, Polymorphism, Single Nucleotide, Linkage Disequilibrium
- Abstract
Several solutions have been proposed to extend the transmission disequilibrium test (TDT) to include cases with missing parental genotype. However, completion of the missing parental genotype may bias the test if the underlying missing data mechanism is informative. Furthermore, all these solutions resolve the problem of missing parental genotype, while offspring with missing genotypes are typically ignored. We propose here an extension to the TDT, called robust TDT (rTDT), able to handle incomplete genotypes on both parents and children and that does not rest on any assumption about the missing data mechanism. rTDT returns minimum and maximum values of TDT that are consistent with all the possible completions of the missing data. We also show that, in some situations, rTDT can achieve both greater power and greater significance than the popular TDT analysis of incomplete data. rTDT is applied to a database of markers of susceptibility to Crohn's disease and it shows that only 2 of the 11 markers originally associated with the phenotype do not depend on assumptions about the missing data mechanism.
- Published
- 2004
- Full Text
- View/download PDF
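The idea in the robust TDT abstract above (entry 44), reporting the minimum and maximum TDT value over all completions of the missing data, can be illustrated with a deliberately simplified sketch. It is not the authors' rTDT: it assumes the standard TDT statistic (b - c)^2 / (b + c) and assumes that every one of the `unknown` transmissions comes from a heterozygous parent and is therefore either a transmission (added to b) or a non-transmission (added to c). The function names and the counts are hypothetical.

```python
def tdt_statistic(b, c):
    """Standard TDT statistic for b transmissions and c non-transmissions."""
    total = b + c
    # With no informative transmissions the statistic is undefined; report 0.0.
    return (b - c) ** 2 / total if total > 0 else 0.0


def tdt_bounds(b, c, unknown):
    """Min and max TDT over all ways of completing `unknown` transmissions.

    Simplifying assumption (not from the paper): each unknown transmission
    is informative and is either transmitted (b) or not transmitted (c).
    """
    values = [
        tdt_statistic(b + k, c + unknown - k)
        for k in range(unknown + 1)
    ]
    return min(values), max(values)


if __name__ == "__main__":
    low, high = tdt_bounds(b=30, c=10, unknown=4)
    print(f"TDT lies between {low:.3f} and {high:.3f}")
```

A narrow interval suggests the conclusion does not hinge on how the missing genotypes are filled in; a wide one suggests that it does.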
45. Gene expression signature with independent prognostic significance in epithelial ovarian cancer.
- Author
-
Spentzos D, Levine DA, Ramoni MF, Joseph M, Gu X, Boyd J, Libermann TA, and Cannistra SA
- Subjects
- Adult, Aged, Biomarkers, Tumor metabolism, Biopsy, Needle, Chemotherapy, Adjuvant, Combined Modality Therapy, DNA, Complementary analysis, Female, Gene Expression Regulation, Neoplastic, Humans, Immunohistochemistry, Middle Aged, Neoplasm Staging, Neoplasms, Glandular and Epithelial pathology, Neoplasms, Glandular and Epithelial therapy, Ovarian Neoplasms pathology, Ovarian Neoplasms therapy, Ovariectomy methods, Predictive Value of Tests, Prognosis, RNA, Neoplasm analysis, Risk Assessment, Sensitivity and Specificity, Survival Analysis, Treatment Outcome, Genetic Predisposition to Disease, Neoplasms, Glandular and Epithelial genetics, Neoplasms, Glandular and Epithelial mortality, Ovarian Neoplasms genetics, Ovarian Neoplasms mortality
- Abstract
Purpose: Currently available clinical and molecular prognostic factors provide an imperfect assessment of prognosis for patients with epithelial ovarian cancer (EOC). In this study, we investigated whether tumor transcription profiling could be used as a prognostic tool in this disease. Methods: Tumor tissue from 68 patients was profiled with oligonucleotide microarrays. Samples were randomly split into training and validation sets. A three-step training procedure was used to discover a statistically significant Kaplan-Meier split in the training set. The resultant prognostic signature was then tested on an independent validation set for confirmation. Results: In the training set, a 115-gene signature referred to as the Ovarian Cancer Prognostic Profile (OCPP) was identified. When applied to the validation set, the OCPP distinguished between patients with unfavorable and favorable overall survival (median, 30 months v not yet reached, respectively; log-rank P = .004). The signature maintained independent prognostic value in multivariate analysis, controlling for other known prognostic factors such as age, stage, grade, and debulking status. The hazard ratio for death in the unfavorable OCPP group was 4.8 (P = .021 by Cox proportional hazards analysis). Conclusion: The OCPP is an independent prognostic determinant of outcome in EOC. The use of gene profiling may ultimately permit identification of EOC patients appropriate for investigational treatment approaches, based on a low likelihood of achieving prolonged survival with standard first-line platinum-based therapy.
- Published
- 2004
- Full Text
- View/download PDF
46. Bayesian approach to discovering pathogenic SNPs in conserved protein domains.
- Author
-
Cai Z, Tsung EF, Marinescu VD, Ramoni MF, Riva A, and Kohane IS
- Subjects
- Algorithms, Computational Biology methods, Computational Biology statistics & numerical data, Databases, Genetic, Humans, Linkage Disequilibrium genetics, Models, Genetic, Mutation, Missense genetics, Neural Networks, Computer, Predictive Value of Tests, Protein Structure, Tertiary genetics, Transcription Factors genetics, Bayes Theorem, Conserved Sequence genetics, Peptides genetics, Polymorphism, Single Nucleotide genetics
- Abstract
The success rate of association studies can be improved by selecting better genetic markers for genotyping or by providing better leads for identifying pathogenic single nucleotide polymorphisms (SNPs) in the regions of linkage disequilibrium with positive disease associations. We have developed a novel algorithm to predict pathogenic single amino acid changes, either nonsynonymous SNPs (nsSNPs) or missense mutations, in conserved protein domains. Using a Bayesian framework, we found that the probability of a microbial missense mutation causing a significant change in phenotype depended on how much difference it made in several phylogenetic, biochemical, and structural features related to the single amino acid substitution. We tested our model on pathogenic allelic variants (missense mutations or nsSNPs) included in OMIM, and on the other nsSNPs in the same genes (from dbSNP) as the nonpathogenic variants. As a result, our model predicted pathogenic variants with a 10% false-positive rate. The high specificity of our prediction algorithm should make it valuable in genetic association studies aimed at identifying pathogenic SNPs. (Copyright 2004 Wiley-Liss, Inc.)
- Published
- 2004
- Full Text
- View/download PDF
47. Identification of a transcriptional profile associated with in vitro invasion in non-small cell lung cancer cell lines.
- Author
-
Lader AS, Ramoni MF, Zetter BR, Kohane IS, and Kwiatkowski DJ
- Subjects
- Adenocarcinoma genetics, Adenocarcinoma metabolism, Adenocarcinoma secondary, Biomarkers, Tumor, Carcinoma, Non-Small-Cell Lung genetics, Carcinoma, Non-Small-Cell Lung pathology, Carcinoma, Squamous Cell genetics, Carcinoma, Squamous Cell metabolism, Carcinoma, Squamous Cell pathology, Cell Adhesion, Collagen metabolism, Drug Combinations, Humans, Laminin metabolism, Lung Neoplasms genetics, Lung Neoplasms metabolism, Lung Neoplasms pathology, Oligonucleotide Array Sequence Analysis, Proteoglycans metabolism, Tumor Cells, Cultured, Carcinoma, Non-Small-Cell Lung metabolism, Gene Expression Profiling, Gene Expression Regulation, Neoplastic, Neoplasm Invasiveness
- Abstract
Although much has been learned about basic mechanisms of cell invasion, the genes whose expression is required for this process by malignant cell lines have remained obscure. We assessed invasion through Matrigel using EGF as a chemoattractant and gene expression profiles using oligonucleotide microarrays for 22 non-small cell lung cancer cell lines. The expression of 22 genes was significantly correlated (p < 0.001) with the measured invasion index. Cluster analysis demonstrated that gene expression profiles classify the cell lines into low and high invasive subgroups. Considering invasiveness as a dichotomous variable, Bayesian analysis was used to identify genes that have the highest probability of being differentially expressed between the high and low invasion groups. This analysis identified 16 genes whose expression was associated with invasiveness. "Leave one out" cross validation was 91% accurate. Nine genes were identified in both correlation and Bayesian analyses. Seven of the nine genes were negatively associated with invasion and four of those genes are plasma membrane proteins. The two genes with the highest inverse association with invasion, TACSTD1 and CLDN3, are involved with cell adhesion and cell-cell interactions, respectively. Interestingly, the gene with the highest positive association with invasion, SERPINE1 (PAI-1), is a protease inhibitor. These and the other genes identified by both analyses represent targets for further study to assess their importance in non-small cell lung cancer invasion and metastasis.
- Published
- 2004
- Full Text
- View/download PDF
48. Expression profiling and identification of novel genes involved in myogenic differentiation.
- Author
-
Tomczak KK, Marinescu VD, Ramoni MF, Sanoudou D, Montanaro F, Han M, Kunkel LM, Kohane IS, and Beggs AH
- Subjects
- Animals, Cell Cycle genetics, Cell Division genetics, Cell Line, Cluster Analysis, Down-Regulation, Expressed Sequence Tags, Mice, Muscle Development, Muscles cytology, Muscles metabolism, RNA, Messenger genetics, RNA, Messenger metabolism, Transcription, Genetic, Cell Differentiation genetics, Gene Expression Profiling, Genes genetics, Myoblasts cytology, Myoblasts metabolism
- Abstract
Skeletal muscle differentiation is a complex, highly coordinated process that relies on precise temporal gene expression patterns. To better understand this cascade of transcriptional events, we used expression profiling to analyze gene expression in a 12-day time course of differentiating C2C12 myoblasts. Cluster analysis specific for time-ordered microarray experiments classified 2895 genes and ESTs with variable expression levels between proliferating and differentiating cells into 22 clusters with distinct expression patterns during myogenesis. Expression patterns for several known and novel genes were independently confirmed by real-time quantitative RT-PCR and/or Western blotting and immunofluorescence. MyoD and MEF family members exhibited unique expression kinetics that were highly coordinated with cell-cycle withdrawal regulators. Among genes with peak expression levels during cell cycle withdrawal were Vcam1, Itgb3, Itga5, Vcl, as well as Ptger4, a gene not previously associated with the process of myogenesis. One interesting uncharacterized transcript that is highly induced during myogenesis encodes several immunoglobulin repeats with sequence similarity to titin, a large sarcomeric protein. These data sets identify many additional uncharacterized transcripts that may play important functions in muscle cell proliferation and differentiation and provide a baseline for comparison with C2C12 cells expressing various mutant genes involved in myopathic disorders.
- Published
- 2004
- Full Text
- View/download PDF
49. Bayesian machine learning and its potential applications to the genomic study of oral oncology.
- Author
-
Sebastiani P, Yu YH, and Ramoni MF
- Subjects
- Blood, Cluster Analysis, Computational Biology methods, DNA, Neoplasm analysis, Fibroblasts, Gene Expression Profiling, Gene Expression Regulation, Neoplastic, Humans, Oligonucleotide Array Sequence Analysis, Systems Integration, Bayes Theorem, Carcinoma, Squamous Cell genetics, Genomics methods, Head and Neck Neoplasms genetics, Medical Informatics Applications, Neural Networks, Computer
- Abstract
With the completion of the Human Genome Project and the growing computational challenges presented by the large amount of genomic data available today, machine learning is becoming an integral part of biomedical research and plays a major role in the emerging fields of bioinformatics and computational biology. This situation offers unparalleled opportunities and unprecedented challenges to machine learning research in general and to Bayesian learning methods in particular. This paper outlines some of the opportunities and the challenges of this endeavor, it describes where the efforts of "cracking the code of life" can most benefit from a Bayesian approach, and it identifies some potential applications of Bayesian machine learning methods to the genomic analysis of squamous cell carcinomas of the head and neck.
- Published
- 2003
- Full Text
- View/download PDF
50. Minimal haplotype tagging.
- Author
-
Sebastiani P, Lazarus R, Weiss ST, Kunkel LM, Kohane IS, and Ramoni MF
- Subjects
- Algorithms, Black People genetics, Evolution, Molecular, Female, Genetic Variation, Humans, Male, Models, Genetic, White People genetics, Black or African American, Haplotypes genetics, Polymorphism, Single Nucleotide
- Abstract
The high frequency of single-nucleotide polymorphisms (SNPs) in the human genome presents an unparalleled opportunity to track down the genetic basis of common diseases. At the same time, the sheer number of SNPs also makes genome-wide disease association studies unfeasible. The haplotypic nature of the human genome, however, lends itself to the selection of a parsimonious set of SNPs, called haplotype tagging SNPs (htSNPs), able to distinguish the haplotypic variations in a population. Current approaches rely on statistical analysis of transmission rates to identify htSNPs. In contrast to these approximate methods, this contribution describes an exact, analytical, and lossless method, called BEST (Best Enumeration of SNP Tags), able to identify the minimum set of SNPs tagging an arbitrary set of haplotypes from either pedigree or independent samples. Our results confirm that a small proportion of SNPs is sufficient to capture the haplotypic variations in a population and that this proportion decreases exponentially as the haplotype length increases. We used BEST to tag the haplotypes of 105 genes in an African-American and a European-American sample. An interesting finding of this analysis is that the vast majority (95%) of the htSNPs in the European-American sample is a subset of the htSNPs of the African-American sample. This result seems to provide further evidence that a severe bottleneck occurred during the founding of Europe and the conjectured "Out of Africa" event.
- Published
- 2003
- Full Text
- View/download PDF
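The goal stated in the haplotype tagging abstract above (entry 50), finding the smallest set of SNPs that still tells every haplotype apart, can be sketched with a brute-force search. This is only an illustration of the problem statement, not the BEST algorithm itself: it simply tries SNP subsets in order of increasing size, and it assumes haplotypes are given as equal-length strings of allele codes. The function name and the example data are hypothetical.

```python
from itertools import combinations


def minimal_tag_snps(haplotypes):
    """Return the smallest tuple of SNP positions that distinguishes
    every distinct haplotype (exhaustive search, smallest subsets first)."""
    distinct = sorted(set(haplotypes))
    n_snps = len(distinct[0]) if distinct else 0
    for size in range(n_snps + 1):
        for positions in combinations(range(n_snps), size):
            # Project each haplotype onto the candidate SNP positions.
            projected = {tuple(h[p] for p in positions) for h in distinct}
            # The subset tags the sample if no two haplotypes collapse together.
            if len(projected) == len(distinct):
                return positions
    return tuple(range(n_snps))


if __name__ == "__main__":
    sample = ["0011", "0101", "1100", "0011"]
    print(minimal_tag_snps(sample))
```

The search is exponential in the number of SNPs, so it is only practical for short haplotypes; it is meant to make the definition of a minimal tagging set concrete rather than to scale.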