33 results on '"Natalja Kurbatova"'
Search Results
2. Disease ontologies for knowledge graphs
- Author
-
Natalja Kurbatova and Rowan Swiers
- Subjects
Ontologies ,Knowledge graph ,Data integration ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5
- Abstract
Background: Data integration to build a biomedical knowledge graph is a challenging task. There are multiple disease ontologies used in data sources and publications, each having its hierarchy. A common task is to map between ontologies, find disease clusters and finally build a representation of the chosen disease area. There is a shortage of published resources and tools to facilitate interactive, efficient and flexible cross-referencing and analysis of multiple disease ontologies commonly found in data sources and research. Results: Our results are represented as a knowledge graph solution that uses disease ontology cross-references and facilitates switching between ontology hierarchies for data integration and other tasks. Conclusions: Grakn core with pre-installed “Disease ontologies for knowledge graphs” facilitates the biomedical knowledge graph build and provides an elegant solution for the multiple disease ontologies problem.
- Published
- 2021
3. Urinary metabolic phenotyping for Alzheimer’s disease
- Author
-
Natalja Kurbatova, Manik Garg, Luke Whiley, Elena Chekmeneva, Beatriz Jiménez, María Gómez-Romero, Jake Pearce, Torben Kimhofer, Ellie D’Hondt, Hilkka Soininen, Iwona Kłoszewska, Patrizia Mecocci, Magda Tsolaki, Bruno Vellas, Dag Aarsland, Alejo Nevado-Holgado, Benjamine Liu, Stuart Snowden, Petroula Proitsi, Nicholas J. Ashton, Abdul Hye, Cristina Legido-Quigley, Matthew R. Lewis, Jeremy K. Nicholson, Elaine Holmes, Alvis Brazma, and Simon Lovestone
- Subjects
Medicine ,Science
- Abstract
Finding early disease markers using non-invasive and widely available methods is essential to develop a successful therapy for Alzheimer’s Disease. Few studies to date have examined urine, the most readily available biofluid. Here we report the largest study to date using comprehensive metabolic phenotyping platforms (NMR spectroscopy and UHPLC-MS) to probe the urinary metabolome in depth in people with Alzheimer’s Disease and Mild Cognitive Impairment. Feature reduction was performed using metabolomic Quantitative Trait Loci (mQTLs), resulting in a list of metabolites associated with the genetic variants. This approach improves accuracy in the identification of disease states and provides a route to a plausible mechanistic link to pathological processes. Using these mQTLs we built a Random Forests model, which not only correctly discriminates between people with Alzheimer’s Disease and age-matched controls, but also between individuals with Mild Cognitive Impairment who were later diagnosed with Alzheimer’s Disease and those who were not. Further annotation of top-ranking metabolic features nominated by the trained model revealed the involvement of cholesterol-derived metabolites and small molecules that were linked to Alzheimer’s pathology in previous studies.
- Published
- 2020
4. A large scale hearing loss screen reveals an extensive unexplored genetic landscape for auditory dysfunction
- Author
-
Michael R. Bowl, Michelle M. Simon, Neil J. Ingham, Simon Greenaway, Luis Santos, Heather Cater, Sarah Taylor, Jeremy Mason, Natalja Kurbatova, Selina Pearson, Lynette R. Bower, Dave A. Clary, Hamid Meziane, Patrick Reilly, Osamu Minowa, Lois Kelsey, The International Mouse Phenotyping Consortium, Glauco P. Tocchini-Valentini, Xiang Gao, Allan Bradley, William C. Skarnes, Mark Moore, Arthur L. Beaudet, Monica J. Justice, John Seavitt, Mary E. Dickinson, Wolfgang Wurst, Martin Hrabe de Angelis, Yann Herault, Shigeharu Wakana, Lauryl M. J. Nutter, Ann M. Flenniken, Colin McKerlie, Stephen A. Murray, Karen L. Svenson, Robert E. Braun, David B. West, K. C. Kent Lloyd, David J. Adams, Jacqui White, Natasha Karp, Paul Flicek, Damian Smedley, Terrence F. Meehan, Helen E. Parkinson, Lydia M. Teboul, Sara Wells, Karen P. Steel, Ann-Marie Mallon, and Steve D. M. Brown
- Subjects
Science
- Abstract
The full extent of the genetic basis for hearing impairment is unknown. Here, as part of the International Mouse Phenotyping Consortium, the authors perform a hearing loss screen in 3006 mouse knockout strains and identify 52 new candidate genes for genetic hearing loss.
- Published
- 2017
5. Prevalence of sexual dimorphism in mammalian phenotypic traits
- Author
-
Natasha A. Karp, Jeremy Mason, Arthur L. Beaudet, Yoav Benjamini, Lynette Bower, Robert E. Braun, Steve D.M. Brown, Elissa J. Chesler, Mary E. Dickinson, Ann M. Flenniken, Helmut Fuchs, Martin Hrabe de Angelis, Xiang Gao, Shiying Guo, Simon Greenaway, Ruth Heller, Yann Herault, Monica J. Justice, Natalja Kurbatova, Christopher J. Lelliott, K.C. Kent Lloyd, Ann-Marie Mallon, Judith E. Mank, Hiroshi Masuya, Colin McKerlie, Terrence F. Meehan, Richard F. Mott, Stephen A. Murray, Helen Parkinson, Ramiro Ramirez-Solis, Luis Santos, John R. Seavitt, Damian Smedley, Tania Sorg, Anneliese O. Speak, Karen P. Steel, Karen L. Svenson, International Mouse Phenotyping Consortium, Shigeharu Wakana, David West, Sara Wells, Henrik Westerberg, Shay Yaacoby, and Jacqueline K. White
- Subjects
Science
- Abstract
Systemic dissection of sexually dimorphic phenotypes in mice is lacking. Here, Karp and the International Mouse Phenotype Consortium show that approximately 10% of qualitative traits and 56% of quantitative traits in mice as measured in laboratory setting are sexually dimorphic.
- Published
- 2017
6. Applying the ARRIVE Guidelines to an In Vivo Database.
- Author
-
Natasha A Karp, Terry F Meehan, Hugh Morgan, Jeremy C Mason, Andrew Blake, Natalja Kurbatova, Damian Smedley, Julius Jacobsen, Richard F Mott, Vivek Iyer, Peter Matthews, David G Melvin, Sara Wells, Ann M Flenniken, Hiroshi Masuya, Shigeharu Wakana, Jacqueline K White, K C Kent Lloyd, Corey L Reynolds, Richard Paylor, David B West, Karen L Svenson, Elissa J Chesler, Martin Hrabě de Angelis, Glauco P Tocchini-Valentini, Tania Sorg, Yann Herault, Helen Parkinson, Ann-Marie Mallon, and Steve D M Brown
- Subjects
Biology (General) ,QH301-705.5
- Abstract
The Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines were developed to address the lack of reproducibility in biomedical animal studies and improve the communication of research findings. While intended to guide the preparation of peer-reviewed manuscripts, the principles of transparent reporting are also fundamental for in vivo databases. Here, we describe the benefits and challenges of applying the guidelines for the International Mouse Phenotyping Consortium (IMPC), whose goal is to produce and phenotype 20,000 knockout mouse strains in a reproducible manner across ten research centres. In addition to ensuring the transparency and reproducibility of the IMPC, the solutions to the challenges of applying the ARRIVE guidelines in the context of IMPC will provide a resource to help guide similar initiatives in the future.
- Published
- 2015
7. PhenStat: A Tool Kit for Standardized Analysis of High Throughput Phenotypic Data.
- Author
-
Natalja Kurbatova, Jeremy C Mason, Hugh Morgan, Terrence F Meehan, and Natasha A Karp
- Subjects
Medicine ,Science
- Abstract
The lack of reproducibility with animal phenotyping experiments is a growing concern among the biomedical community. One contributing factor is the inadequate description of statistical analysis methods that prevents researchers from replicating results even when the original data are provided. Here we present PhenStat--a freely available R package that provides a variety of statistical methods for the identification of phenotypic associations. The methods have been developed for high throughput phenotyping pipelines implemented across various experimental designs with an emphasis on managing temporal variation. PhenStat is targeted to two user groups: small-scale users who wish to interact and test data from large resources and large-scale users who require an automated statistical analysis pipeline. The software provides guidance to the user for selecting appropriate analysis methods based on the dataset and is designed to allow for additions and modifications as needed. The package was tested on mouse and rat data and is used by the International Mouse Phenotyping Consortium (IMPC). By providing raw data and the version of PhenStat used, resources like the IMPC give users the ability to replicate and explore results within their own computing environment.
- Published
- 2015
8. IsoCleft Finder – a web-based tool for the detection and analysis of protein binding-site geometric and chemical similarities [v2; ref status: indexed, http://f1000r.es/13y]
- Author
-
Natalja Kurbatova, Matthieu Chartier, María Inés Zylber, and Rafael Najmanovich
- Subjects
Biomacromolecule-Ligand Interactions ,Drug Discovery & Design ,Theory & Simulation ,Medicine ,Science
- Abstract
IsoCleft Finder is a web-based tool for the detection of local geometric and chemical similarities between potential small-molecule binding cavities and a non-redundant dataset of ligand-bound known small-molecule binding-sites. The non-redundant dataset developed as part of this study is composed of 7339 entries representing unique Pfam/PDB-ligand (hetero group code) combinations with known levels of cognate ligand similarity. The query cavity can be uploaded by the user or detected automatically by the system using existing PDB entries as well as user-provided structures in PDB format. In all cases, the user can refine the definition of the cavity interactively via a browser-based Jmol 3D molecular visualization interface. Furthermore, users can restrict the search to a subset of the dataset using a cognate-similarity threshold. Local structural similarities are detected using the IsoCleft software and ranked according to two criteria (number of atoms in common and Tanimoto score of local structural similarity) and the associated Z-score and p-value measures of statistical significance. The results, including predicted ligands, target proteins, similarity scores, number of atoms in common, etc., are shown in a powerful interactive graphical interface. This interface permits the visualization of target ligands superimposed on the query cavity and additionally provides a table of pairwise ligand topological similarities. Similarities between top scoring ligands serve as an additional tool to judge the quality of the results obtained. We present several examples where IsoCleft Finder provides useful functional information. IsoCleft Finder results are complementary to existing approaches for the prediction of protein function from structure, rational drug design and x-ray crystallography. IsoCleft Finder can be found at: http://bcb.med.usherbrooke.ca/isocleftfinder.
- Published
- 2013
9. Exploration of Evolutionary Relations between Protein Structures.
- Author
-
Natalja Kurbatova and Juris Viksna
- Published
- 2008
10. Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites.
- Author
-
Rafael Najmanovich, Natalja Kurbatova, and Janet M. Thornton
- Published
- 2008
11. Protein Structure Comparison Based on Fold Evolution.
- Author
-
Natalja Kurbatova, Laura Mancinska, and Juris Viksna
- Published
- 2007
12. A comparative anatomical and morphological study of vegetative organs of Iris sogdiana Bunge from natural populations of southeastern Kazakhstan
- Author
-
Phytointroduction, Chinargul Aldassugyrova, Nadezhda Gemejiyeva, Madina Ramazanova, and Natalja Kurbatova
- Subjects
medicine.anatomical_structure ,fungi ,Botany ,medicine ,General Medicine ,Iris (anatomy) ,Biology ,Natural (archaeology)
- Abstract
A comparative anatomical and morphological analysis of the vegetative organs of Iris sogdiana from various growing conditions is of interest because it makes it possible to identify the features and adaptive capabilities of the species. Microscopic studies of I. sogdiana vegetative organs from natural populations of southeastern Kazakhstan were carried out for the first time. A positive correlation between the growth habitat and the morphometric parameters of the species has been shown. The structural features of the vegetative organs were revealed, and it was established that the development of the covering, ground and vascular tissues in I. sogdiana is associated with a moisture gradient. The leaf blade is characterized by numerous stomata slightly submerged into the leaf, numerous intercellular spaces, airways in the leaf mesophyll, and vascular bundles surrounded by sclerenchyma cells. The stem has a certain degree of ribbing, a thickened surface of the outer cells of the epidermis, and a pronounced layer of pericycle sclerenchyma. The root has a one- to two-layer ectoderm, developed thickening of the radial membranes of the endoderm cells, and a number of vessels that differs depending on the growth location. The study results indicate that plants of populations 1 and 2 are characterized by more xerophytic features of organization, whereas mesophytic features are inherent in plants of population 3. The studied species is distinguished by average morphometric indicators of structure and has a mesoxerophytic organization of its anatomical structure. The morphological structure of I. sogdiana leads to the conclusion that a sufficient amount of moisture is necessary for a short period of vegetation and for the development of the plant.
- Published
- 2020
13. Disease ontologies for knowledge graphs
- Author
-
Rowan Swiers and Natalja Kurbatova
- Subjects
0301 basic medicine ,QH301-705.5 ,Computer science ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Information Storage and Retrieval ,Ontology (information science) ,computer.software_genre ,Biochemistry ,Pattern Recognition, Automated ,Task (project management) ,03 medical and health sciences ,0302 clinical medicine ,Disease Ontology ,Structural Biology ,Ethnicity ,Ontologies ,Humans ,030212 general & internal medicine ,Biology (General) ,Representation (mathematics) ,Molecular Biology ,Knowledge graph ,Hierarchy ,Information retrieval ,Applied Mathematics ,Computer Science Applications ,Knowledge ,030104 developmental biology ,Biological Ontologies ,Core (graph theory) ,Graph (abstract data type) ,Data integration ,computer ,Software
- Abstract
Background: Data integration to build a biomedical knowledge graph is a challenging task. There are multiple disease ontologies used in data sources and publications, each having its hierarchy. A common task is to map between ontologies, find disease clusters and finally build a representation of the chosen disease area. There is a shortage of published resources and tools to facilitate interactive, efficient and flexible cross-referencing and analysis of multiple disease ontologies commonly found in data sources and research. Results: Our results are represented as a knowledge graph solution that uses disease ontology cross-references and facilitates switching between ontology hierarchies for data integration and other tasks. Conclusions: Grakn core with pre-installed “Disease ontologies for knowledge graphs” facilitates the biomedical knowledge graph build and provides an elegant solution for the multiple disease ontologies problem.
- Published
- 2021
14. IsoCleft Finder – a web-based tool for the detection and analysis of protein binding-site geometric and chemical similarities [version 2; referees: 2 approved, 1 approved with reservations]
- Author
-
Natalja Kurbatova, Matthieu Chartier, María Inés Zylber, and Rafael Najmanovich
- Subjects
Web Tool ,Articles ,Biomacromolecule-Ligand Interactions ,Drug Discovery & Design ,Theory & Simulation ,Ligand ,Tanimoto ,protein structure ,Pfam ,PDB
- Abstract
IsoCleft Finder is a web-based tool for the detection of local geometric and chemical similarities between potential small-molecule binding cavities and a non-redundant dataset of ligand-bound known small-molecule binding-sites. The non-redundant dataset developed as part of this study is composed of 7339 entries representing unique Pfam/PDB-ligand (hetero group code) combinations with known levels of cognate ligand similarity. The query cavity can be uploaded by the user or detected automatically by the system using existing PDB entries as well as user-provided structures in PDB format. In all cases, the user can refine the definition of the cavity interactively via a browser-based Jmol 3D molecular visualization interface. Furthermore, users can restrict the search to a subset of the dataset using a cognate-similarity threshold. Local structural similarities are detected using the IsoCleft software and ranked according to two criteria (number of atoms in common and Tanimoto score of local structural similarity) and the associated Z-score and p-value measures of statistical significance. The results, including predicted ligands, target proteins, similarity scores, number of atoms in common, etc., are shown in a powerful interactive graphical interface. This interface permits the visualization of target ligands superimposed on the query cavity and additionally provides a table of pairwise ligand topological similarities. Similarities between top scoring ligands serve as an additional tool to judge the quality of the results obtained. We present several examples where IsoCleft Finder provides useful functional information. IsoCleft Finder results are complementary to existing approaches for the prediction of protein function from structure, rational drug design and x-ray crystallography. IsoCleft Finder can be found at: http://bcb.med.usherbrooke.ca/isocleftfinder.
- Published
- 2013
15. IsoCleft Finder – a web-based tool for the detection and analysis of protein binding-site geometric and chemical similarities [version 1; referees: awaiting peer review]
- Author
-
Natalja Kurbatova, Matthieu Chartier, María Inés Zylber, and Rafael Najmanovich
- Subjects
Web Tool ,Articles ,Biomacromolecule-Ligand Interactions ,Drug Discovery & Design ,Theory & Simulation ,Ligand ,Tanimoto ,protein structure ,Pfam ,PDB
- Abstract
IsoCleft Finder is a web-based tool for the detection of local geometric and chemical similarities between potential small-molecule binding cavities and a non-redundant dataset of ligand-bound known small-molecule binding-sites. The non-redundant dataset developed as part of this study is composed of 7339 entries representing unique Pfam/PDB-ligand (hetero group code) combinations with known levels of cognate ligand similarity. The query cavity can be uploaded by the user or detected automatically by the system using existing PDB entries as well as user-provided structures in PDB format. In all cases, the user can refine the definition of the cavity interactively via a browser-based Jmol 3D molecular visualization interface. Furthermore, users can restrict the search to a subset of the dataset using a cognate-similarity threshold. Local structural similarities are detected using the IsoCleft software and ranked according to two criteria (number of atoms in common and Tanimoto score of local structural similarity) and the associated Z-score and p-value measures of statistical significance. The results, including predicted ligands, target proteins, similarity scores, number of atoms in common, etc., are shown in a powerful interactive graphical interface. This interface permits the visualization of target ligands superimposed on the query cavity and additionally provides a table of pairwise ligand topological similarities. Similarities between top scoring ligands serve as an additional tool to judge the quality of the results obtained. We present several examples where IsoCleft Finder provides useful functional information. IsoCleft Finder results are complementary to existing approaches for the prediction of protein function from structure, rational drug design and x-ray crystallography. IsoCleft Finder can be found at: http://bcb.med.usherbrooke.ca/isocleftfinder.
- Published
- 2013
16. [F1–02–02]: DISCOVERY AND VALIDATION OF MULTIMODAL BIOMARKER SIGNATURES RELATING TO ALZHEIMER'S DISEASE PATHOLOGY AND PROGRESSION
- Author
-
Richard Dobson, Alejo J. Nevado-Holgado, Torben Kimhofer, Anders Wallin, Frederik Barkhof, Pablo Martinez-Lage, Anoushka Leslie, Julius Popp, Natalja Kurbatova, Alison L. Baird, Sarah Westwood, Beatriz Jiménez, Giovanni B. Frisoni, Cristina Legido-Quigley, Nicholas J. Ashton, Stephanie J.B. Vos, Lynn Maslen, Abdul Hye, Jake T M Pearce, Pieter Jelle Visser, Ellie D’Hondt, Olivia Kowalczyk, Stuart G. Snowden, Eric Westman, Rik Vandenberghe, Henrik Zetterberg, Alberto Lleó, Matthew R. Lewis, Wilfried Verachtert, Benjamine Young Liu, David Ruvolo, Philip Scheltens, María Gómez-Romero, Danai Dima, Luke Whiley, Petroula Proitsi, Mark David, Nicola Voyle, Elaine Holmes, Sebastiaan Engelborghs, Johannes Streffer, Isabelle Bos, Toby J. Athersuch, Simon Lovestone, Lars Bertram, José Luis Molinuevo, and Tom Ashby
- Subjects
Pathology ,medicine.medical_specialty ,Epidemiology ,business.industry ,Health Policy ,Genomics ,Disease ,medicine.disease ,Proteomics ,Psychiatry and Mental health ,Cellular and Molecular Neuroscience ,Developmental Neuroscience ,Cohort ,medicine ,Dementia ,Biomarker (medicine) ,Neurology (clinical) ,Geriatrics and Gerontology ,Cognitive decline ,Biomarker discovery ,business
- Abstract
Background: Biomarkers of Alzheimer’s disease (AD) pathology and progression have now been identified across various modalities. The aims of the two studies presented (ANM-MMB and EMIF-AD biomarker discovery) are to discover and replicate previously identified biomarkers of disease pathology and progression, and moreover to determine whether a multimodal biomarker signature may add value in comparison to a biomarker of a single modality. Methods: The ANM-MMB cohort is comprised of 718 AD, MCI converters and non-converters, and control subjects selected from the AddNeuroMed, Alzheimer’s research trust and Dementia case register cohorts. Cognitive measures, serum and urine metabolomics, structural MRI, genomics, whole blood transcriptomics and plasma proteomics data were available. The EMIF-AD biomarker discovery cohort consists of 1221 AD, MCI and control subjects, selected from the EMIF catalogue. All subjects had existing amyloid measures (CSF Aβ or amyloid-PET), structural MRI and clinical data, and furthermore plasma proteomics (targeted and untargeted), CSF proteomics (targeted), metabolomics, genomics and epigenetics data were generated. For both studies univariate and multivariate statistics were utilised to identify candidate biomarkers of AD pathology (neurodegeneration and/or brain amyloid burden), rates of cognitive decline, and MCI progression to dementia. pQTL-eQTL-mQTL analyses, network/pathway analysis, and multimodal classifiers were employed to detect multimodal signatures. Results: Initial analyses indicate that in the ANM-MMB study a serum- and urine-derived 15-metabolite classifier predicts MCI progression to AD with 72% accuracy, and the biological significance of the metabolites included in the biomarker panel was identified. Further analyses will examine whether a multimodal classifier is able to predict with even greater accuracy. We will then seek to replicate this in the EMIF-AD biomarker discovery study. Further analyses will also examine single and multimodal biomarker classifiers of other endophenotypes. Conclusions: These two studies could be used to identify novel and replicate previously identified single modality biomarker findings. Furthermore the impact of combining the additional modalities with these findings will be discussed. Computational and technical challenges encountered and the bioinformatics pipeline devised in the multimodal analysis of the ANM-MMB cohort will be used to inform the analysis pipeline of the EMIF-AD biomarker discovery study as a replication.
- Published
- 2017
17. Prevalence of sexual dimorphism in mammalian phenotypic traits
- Author
-
Damian Smedley, Colin McKerlie, Xiang Gao, Henrik Westerberg, Simon Greenaway, Monica J. Justice, Hiroshi Masuya, Elissa J. Chesler, Robert E. Braun, Mary E. Dickinson, Shay Yaacoby, Stephen A. Murray, Karen L. Svenson, Jeremy Mason, Martin Hrabé de Angelis, Luis Santos, Tania Sorg, Christopher J. Lelliott, Sara Wells, Ann M. Flenniken, Ruth Heller, Ann-Marie Mallon, Lynette Bower, Karen P. Steel, Helen Parkinson, Judith E. Mank, Arthur L. Beaudet, Kevin C K Lloyd, Richard Mott, Yann Herault, Yoav Benjamini, Jacqueline K. White, Steve D.M. Brown, Shiying Guo, John R. Seavitt, Helmut Fuchs, Natalja Kurbatova, Anneliese O. Speak, Natasha A. Karp, Ramiro Ramirez-Solis, Terrence F. Meehan, David B. West, Shigeharu Wakana, The Wellcome Trust Sanger Institute [Cambridge], AstraZeneca [Cambridge, UK], European Bioinformatics Institute [Hinxton] (EMBL-EBI), EMBL Heidelberg, Baylor College of Medicine (BCM), Baylor University, Tel Aviv University (TAU), University of California [Davis] (UC Davis), University of California (UC), The Jackson Laboratory [Bar Harbor] (JAX), MRC Harwell Institute [UK], Helmholtz Zentrum München = German Research Center for Environmental Health, Technische Universität München = Technical University of Munich (TUM), German Center for Diabetes Research - Deutsches Zentrum für Diabetesforschung [Neuherberg] (DZD), Nanjing University (NJU), Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Université de Strasbourg (UNISTRA)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), French National Infrastructure for Mouse Phenogenomics (PHENOMIN), Institut Clinique de la Souris (ICS), The Hospital for sick children [Toronto] (SickKids), University College of London [London] (UCL), RIKEN BioResource Research Center [Tsukuba, Japan] (RIKEN BRC), Queen Mary University of London (QMUL), King‘s College London, Children's Hospital Oakland Research Institute (CHORI), 
International Mouse Phenotyping Consortium: Yuichi Obata, Tomohiro Suzuki, Masaru Tamura, Hideki Kaneda, Tamio Furuse, Kimio Kobayashi, Ikuo Miura, Ikuko Yamada, Nobuhiko Tanaka, Atsushi Yoshiki, Shinya Ayabe, David A Clary, Heather A Tolentino, Michael A Schuchbauer, Todd Tolentino, Joseph Anthony Aprile, Sheryl M Pedroia, Lois Kelsey, Igor Vukobradovic, Zorana Berberovic, Celeste Owen, Dawei Qu, Ruolin Guo, Susan Newbigging, Lily Morikawa, Napoleon Law, Xueyuan Shang, Patricia Feugas, Yanchun Wang, Mohammad Eskandarian, Yingchun Zhu, Lauryl M J Nutter, Patricia Penton, Valerie Laurin, Shannon Clarke, Qing Lan, Khondoker Sohel, David Miller, Greg Clark, Jane Hunter, Jorge Cabezas, Mohammed Bubshait, Tracy Carroll, Sandra Tondat, Suzanne MacMaster, Monica Pereira, Marina Gertsenstein, Ozge Danisment, Elsa Jacob, Amie Creighton, Gillian Sleep, James Clark, Lydia Teboul, Martin Fray, Adam Caulder, Jorik Loeffler, Gemma Codner, James Cleak, Sara Johnson, Zsombor Szoke-Kovacs, Adam Radage, Marina Maritati, Joffrey Mianne, Wendy Gardiner, Susan Allen, Heather Cater, Michelle Stewart, Piia Keskivali-Bond, Caroline Sinclair, Ellen Brown, Brendan Doe, Hannah Wardle-Jones, Evelyn Grau, Nicola Griggs, Mike Woods, Helen Kundi, Mark N D Griffiths, Christian Kipp, David G Melvin, Navis P S Raj, Simon A Holroyd, David J Gannon, Rafael Alcantara, Antonella Galli, Yvette E Hooks, Catherine L Tudor, Angela L Green, Fiona L Kussy, Elizabeth J Tuck, Emma J Siragher, Simon A Maguire, David T Lafont, Valerie E Vancollie, Selina A Pearson, Amy S Gates, Mark Sanderson, Carl Shannon, Lauren F E Anthony, Maksymilian T Sumowski, Robbie S B McLaren, Agnieszka Swiatkowska, Christopher M Isherwood, Emma L Cambridge, Heather M Wilson, Susana S Caetano, Cecilia Icoresi Mazzeo, Monika H Dabrowska, Charlotte Lillistone, Jeanne Estabel, Anna Karin B Maguire, Laura-Anne Roberson, Guillaume Pavlovic, Marie-Christine Birling, Wattenhofer-Donze Marie, Sylvie Jacquot, Abdel Ayadi, Dalila Ali-Hadji, 
Philippe Charles, Philippe André, Elise Le Marchand, Amal El Amri, Laurent Vasseur, Antonio Aguilar-Pimentel, Lore Becker, Irina Treise, Kristin Moreth, Tobias Stoeger, Oana V Amarie, Frauke Neff, Wolfgang Wurst, Raffi Bekeredjian, Markus Ollert, Thomas Klopstock, Julia Calzada-Wack, Susan Marschall, Robert Brommage, Ralph Steinkamp, Christoph Lengger, Manuela A Östereicher, Holger Maier, Claudia Stoeger, Stefanie Leuchtenberger, AliÖ Yildrim, Lillian Garrett, Sabine M Hölter, Annemarie Zimprich, Claudia Seisenberger, Antje Bürger, Jochen Graw, Oliver Eickelberg, Andreas Zimmer, Eckhard Wolf, Dirk H Busch, Martin Klingenspor, Carsten Schmidt-Weber, Valérie Gailus-Durner, Johannes Beckers, Birgit Rathkolb, Jan Rozman, univOAK, Archive ouverte, Mason, Jeremy [0000-0002-2796-5123], Chesler, Elissa J [0000-0002-5642-5062], Angelis, Martin Hrabe de [0000-0002-7898-2353], Herault, Yann [0000-0001-7049-6900], Lelliott, Christopher J [0000-0001-8087-4530], McKerlie, Colin [0000-0002-2232-0967], Wakana, Shigeharu [0000-0001-8532-0924], Yaacoby, Shay [0000-0002-2583-4170], and Apollo - University of Cambridge Repository
- Subjects
0301 basic medicine ,Genotype ,Science ,Mutant ,General Physics and Astronomy ,[SDV.GEN] Life Sciences [q-bio]/Genetics ,Quantitative trait locus ,Biology ,Article ,General Biochemistry, Genetics and Molecular Biology ,03 medical and health sciences ,Mice ,Quantitative Trait ,Quantitative Trait, Heritable ,Genetics ,Animals ,Modifier ,Gene ,Heritable ,Mammals ,Sex Characteristics ,[SDV.GEN]Life Sciences [q-bio]/Genetics ,Multidisciplinary ,Genes, Modifier ,Body Weight ,General Chemistry ,Phenotypic trait ,Phenotype ,Sexual dimorphism ,030104 developmental biology ,Genes ,Evolutionary biology ,Female ,International Mouse Phenotyping Consortium ,Sex characteristics
- Abstract
The role of sex in biomedical studies has often been overlooked, despite evidence of sexually dimorphic effects in some biological studies. Here, we used high-throughput phenotype data from 14,250 wildtype and 40,192 mutant mice (representing 2,186 knockout lines), analysed for up to 234 traits, and found a large proportion of mammalian traits both in wildtype and mutants are influenced by sex. This result has implications for interpreting disease phenotypes in animal models and humans.
Systemic dissection of sexually dimorphic phenotypes in mice is lacking. Here, Karp and the International Mouse Phenotype Consortium show that approximately 10% of qualitative traits and 56% of quantitative traits in mice as measured in laboratory setting are sexually dimorphic.
- Published
- 2017
18. ArrayExpress update—simplifying data submissions
- Author
-
Robert Petryszak, Ekaterina Pilicheva, Karyn Megy, Alvis Brazma, Tony Burdett, Gabriella Rustici, Olga Melnichuk, Emma Hastings, Ugis Sarkans, Helen Parkinson, Marco Brandizi, Natalja Kurbatova, Andrew Tikhonov, Eleanor Williams, Miroslaw Dylag, Nikolay Kolesnikov, Maria Keays, and Y. Amy Tang
- Subjects
Internet ,business.industry ,Gene Expression Profiling ,High-Throughput Nucleotide Sequencing ,Genomics ,Biology ,Data submission ,Bioinformatics ,World Wide Web ,Databases, Genetic ,Genetics ,Database Issue ,The Internet ,business ,Software ,Oligonucleotide Array Sequence Analysis
- Abstract
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42 000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.
- Published
- 2014
19. ArrayExpress update-trends in database growth and links to data analysis tools
- Author
-
Emma Hastings, Nikolay Kolesnikov, Tony Burdett, Annalisa Mupo, Ugis Sarkans, Helen Parkinson, Jon Ison, Alvis Brazma, Danielle Welter, Roby Mani, Y. Amy Tang, Anjan Sharma, James Malone, Natalja Kurbatova, Gabriella Rustici, Anna Farne, Ibrahim Emam, Tobias Ternent, Ekaterina Pilicheva, Marco Brandizi, Miroslaw Dylag, Maria Keays, Rui Pedro Pereira, Johan Rung, Andrew Tikhonov, and Eleanor Williams
- Subjects
Gene expression omnibus ,0303 health sciences ,Internet ,Information retrieval ,Sequencing data ,High-Throughput Nucleotide Sequencing ,Genomics ,Articles ,Biology ,Bioinformatics ,Microarray Analysis ,Bioconductor ,03 medical and health sciences ,User-Computer Interface ,0302 clinical medicine ,030220 oncology & carcinogenesis ,Databases, Genetic ,Genetics ,Analysis tools ,Functional genomics ,Software ,030304 developmental biology - Abstract
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.
- Published
- 2013
- Full Text
- View/download PDF
20. Gene Expression Atlas update--a value-added database of microarray and sequencing-based functional genomics experiments
- Author
-
Alexey Filippov, Tomasz Adamusiak, Olga Melnichuk, Alvis Brazma, Nikolay Pultsin, Ravensara S. Travillian, Ele Holloway, Nataliya Kryvych, Misha Kapushesky, Eleanor Williams, Anna Farne, Robert Petryszak, Andrey Klebanov, Andrew Tikhonov, Natalja Kurbatova, James Malone, Tony Burdett, Gabriella Rustici, Aedín C. Culhane, Pavel Kurnosov, Helen Parkinson, and Andrey Zorin
- Subjects
European Nucleotide Archive ,Sequence analysis ,Genomics ,Biology ,computer.software_genre ,User-Computer Interface ,03 medical and health sciences ,Atlases as Topic ,0302 clinical medicine ,Databases, Genetic ,Genetics ,Humans ,Gene ,Oligonucleotide Array Sequence Analysis ,030304 developmental biology ,0303 health sciences ,Database ,Sequence Analysis, RNA ,Gene Expression Profiling ,Molecular Sequence Annotation ,Articles ,Gene expression profiling ,MicroRNAs ,Gene nomenclature ,computer ,Functional genomics ,030217 neurology & neurosurgery - Abstract
Gene Expression Atlas (http://www.ebi.ac.uk/gxa) is an added-value database providing information about gene expression in different cell types, organism parts, developmental stages, disease states, sample treatments and other biological/experimental conditions. The content of this database derives from curation, re-annotation and statistical analysis of selected data from the ArrayExpress Archive and the European Nucleotide Archive. A simple interface allows the user to query for differential gene expression either by gene names or attributes or by biological conditions, e.g. diseases, organism parts or cell types. Since our previous report we made 20 monthly releases and, as of Release 11.08 (August 2011), the database supports 19 species, which contains expression data measured for 19 014 biological conditions in 136 551 assays from 5598 independent studies.
- Published
- 2011
21. PhenStat: A Tool Kit for Standardized Analysis of High Throughput Phenotypic Data
- Author
-
Jeremy Mason, Hugh P. Morgan, Natalja Kurbatova, Natasha A. Karp, and Terrence F. Meehan
- Subjects
Male ,Computer science ,lcsh:Medicine ,Datasets as Topic ,Machine learning ,computer.software_genre ,Bioinformatics ,Mice ,High-Throughput Screening Assays ,Animals ,lcsh:Science ,Throughput (business) ,Reproducibility ,Multidisciplinary ,business.industry ,lcsh:R ,Reproducibility of Results ,Reference Standards ,Pipeline (software) ,Rats ,Identification (information) ,Phenotype ,Linear Models ,lcsh:Q ,Female ,Artificial intelligence ,Raw data ,business ,computer ,Software ,Test data ,Research Article - Abstract
The lack of reproducibility with animal phenotyping experiments is a growing concern among the biomedical community. One contributing factor is the inadequate description of statistical analysis methods that prevents researchers from replicating results even when the original data are provided. Here we present PhenStat – a freely available R package that provides a variety of statistical methods for the identification of phenotypic associations. The methods have been developed for high throughput phenotyping pipelines implemented across various experimental designs with an emphasis on managing temporal variation. PhenStat is targeted to two user groups: small-scale users who wish to interact and test data from large resources and large-scale users who require an automated statistical analysis pipeline. The software provides guidance to the user for selecting appropriate analysis methods based on the dataset and is designed to allow for additions and modifications as needed. The package was tested on mouse and rat data and is used by the International Mouse Phenotyping Consortium (IMPC). By providing raw data and the version of PhenStat used, resources like the IMPC give users the ability to replicate and explore results within their own computing environment.
- Published
- 2015
22. ontoCAT
- Author
-
Pavel Kurnosov, Tomasz Adamusiak, Morris A. Swertz, Natalja Kurbatova, Misha Kapushesky, Faculteit Medische Wetenschappen/UMCG, University of Groningen, Groningen Institute for Gastro Intestinal Genetics and Immunology (3GI), and Life Course Epidemiology (LCE)
- Subjects
Statistics and Probability ,Computer science ,Process ontology ,Data_MISCELLANEOUS ,02 engineering and technology ,Ontology (information science) ,Biochemistry ,Open Biomedical Ontologies ,World Wide Web ,03 medical and health sciences ,Terminology as Topic ,0202 electrical engineering, electronic engineering, information engineering ,Upper ontology ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,Information retrieval ,SIMPLE (military communications protocol) ,Ontology-based data integration ,Suggested Upper Merged Ontology ,Computer Science Applications ,Computational Mathematics ,Tree traversal ,Computational Theory and Mathematics ,Vocabulary, Controlled ,Ontology ,020201 artificial intelligence & image processing ,INTEGRATION ,Software - Abstract
Motivation: There exist few simple and easily accessible methods to integrate ontologies programmatically in the R environment. We present ontoCAT—an R package to access ontologies in widely used standard formats, stored locally in the filesystem or available online. The ontoCAT package supports a number of traversal and search functions on a single ontology, as well as searching for ontology terms across multiple ontologies and in major ontology repositories. Availability: The package and sources are freely available in Bioconductor starting from version 2.8: http://bioconductor.org/help/bioc-views/release/bioc/html/ontoCAT.html or via the OntoCAT website http://www.ontocat.org/wiki/r. Contact: natalja@ebi.ac.uk
- Published
- 2011
23. ArrayExpress update--an archive of microarray and high-throughput sequencing-based functional genomics experiments
- Author
-
James Malone, Miroslaw Dylag, Marco Brandizi, Margus Lukk, Anjan Sharma, Alvis Brazma, Roby Mani, Natalja Kurbatova, Eleanor Williams, Niran Abeygunawardena, Gabriella Rustici, Ekaterina Pilicheva, Tomasz Adamusiak, Ele Holloway, Emma Hastings, Ibrahim Emam, Tony Burdett, Anna Farne, Nikolay Kolesnikov, Ugis Sarkans, Helen Parkinson, and Nataliya Sklyar
- Subjects
Gene expression omnibus ,Genetics ,0303 health sciences ,Extramural ,Gene Expression Profiling ,Gene Expression ,High-Throughput Nucleotide Sequencing ,Genomics ,Articles ,Computational biology ,Biology ,DNA sequencing ,Gene expression profiling ,03 medical and health sciences ,0302 clinical medicine ,030220 oncology & carcinogenesis ,Databases, Genetic ,Functional genomics ,Oligonucleotide Array Sequence Analysis ,030304 developmental biology - Abstract
The ArrayExpress Archive (http://www.ebi.ac.uk/arrayexpress) is one of the three international public repositories of functional genomics data supporting publications. It includes data generated by sequencing or array-based technologies. Data are submitted by users and imported directly from the NCBI Gene Expression Omnibus. The ArrayExpress Archive is closely integrated with the Gene Expression Atlas and the sequence databases at the European Bioinformatics Institute. Advanced queries provided via ontology enabled interfaces include queries based on technology and sample attributes such as disease, cell types and anatomy.
- Published
- 2010
24. The International Mouse Phenotyping Consortium Web Portal, a unified point of access for knockout mice and related phenotyping data
- Author
-
Paul Flicek, Mark Grifiths, Helen Parkinson, Duncan Sneddon, Chao-Kung Chen, Vivek Iyer, Peter Matthews, Steve D.M. Brown, Darren J. Oakley, Robert J. Wilson, William C. Skarnes, Armida Di Fenza, Jeremy Mason, Luis Santos, Tanja Fiegel, Gautier Koscielny, Ann-Marie Mallon, Andrew Blake, Hugh Morgan, Damian Smedley, Henrik Westerberg, Alan Horne, Asfand Qazi, Jonathan Warren, Ahmad Retha, Gagarine Yaikhom, Natalja Kurbatova, David Melvin, Richard Easty, Natasha A. Karp, Terrence F. Meehan, Julian Atienza-Herrero, and Jack Regnart
- Subjects
Computational biology ,Biology ,Bioinformatics ,computer.software_genre ,Open Biomedical Ontologies ,03 medical and health sciences ,Annotation ,Mice ,0302 clinical medicine ,Databases, Genetic ,Genetics ,Animals ,Statistical analysis ,Mammalian gene ,030304 developmental biology ,Mice, Knockout ,0303 health sciences ,Internet ,Point (typography) ,Biological Ontologies ,V. Human genome, model organisms, comparative genomics ,3. Good health ,Phenotype ,Knockout mouse ,computer ,030217 neurology & neurosurgery ,Data integration - Abstract
The International Mouse Phenotyping Consortium (IMPC) web portal (http://www.mousephenotype.org) provides the biomedical community with a unified point of access to mutant mice and rich collection of related emerging and existing mouse phenotype data. IMPC mouse clinics worldwide follow rigorous highly structured and standardized protocols for the experimentation, collection and dissemination of data. Dedicated 'data wranglers' work with each phenotyping center to collate data and perform quality control of data. An automated statistical analysis pipeline has been developed to identify knockout strains with a significant change in the phenotype parameters. Annotation with biomedical ontologies allows biologists and clinicians to easily find mouse strains with phenotypic traits relevant to their research. Data integration with other resources will provide insights into mammalian gene function and human disease. As phenotype data become available for every gene in the mouse, the IMPC web portal will become an invaluable tool for researchers studying the genetic contributions of genes to human diseases.
- Published
- 2013
25. Transcriptome and genome sequencing uncovers functional variation in humans
- Author
-
Irina Pulyakhina, Stephen B. Montgomery, Xavier Estivill, Katja Kahlem, Gabrielle Bertier, Angel Carracedo, Matti Pirinen, Peter Donnelly, Stylianos E. Antonarakis, Hans Lehrach, Thomas Meitinger, Olof Karlberg, Marc R. Friedländer, Michael Sammeth, Stefan Schreiber, Gert-Jan B. van Ommen, Andrew Tikhonov, Helena Kilpinen, Thomas Giger, Manuel A. Rivas, Pedro G. Ferreira, Ralf Sudbrak, Daniela Esser, Robert Häsler, Roderic Guigó, Oliver Stegle, Thomas Wieland, Ann-Christine Syvänen, Maarten van Iterson, Tuuli Lappalainen, Jean Monlong, Philip Rosenstiel, Daniel G. MacArthur, Sergi Beltran, Monkol Lek, Henk P. J. Buermans, Marta Gut, Peter A C 't Hoen, Emmanouil T. Dermitzakis, Natalja Kurbatova, Liliana Greger, Thasso Griebel, Paolo Ribeca, Tim M. Strom, Marc Sultan, Vyacheslav Amstislavskiy, Thomas Schwarzmayr, Matthias Barann, Alvis Brazma, Halit Ongen, Jonas Carlsson Almlöf, Ivo Gut, Paul Flicek, Esther Lizano, Mark I. McCarthy, Mar Gonzàlez-Porta, and Ismael Padioleau
- Subjects
Quantitative Trait Loci ,RNA, Messenger/analysis/genetics ,Computational biology ,Biology ,Genome ,Polymorphism, Single Nucleotide ,DNA sequencing ,Article ,Transcriptome ,03 medical and health sciences ,0302 clinical medicine ,Genetic variation ,Exons/genetics ,Humans ,ddc:576.5 ,RNA, Messenger ,1000 Genomes Project ,Gene ,Alleles ,030304 developmental biology ,Cell Line, Transformed ,Genetics ,0303 health sciences ,Multidisciplinary ,Genome, Human ,Sequence Analysis, RNA ,Gene Expression Profiling ,Genetic Variation ,High-Throughput Nucleotide Sequencing ,Exons ,Transcriptome/genetics ,Polymorphism, Single Nucleotide/genetics ,Human genetics ,Genetic Variation/genetics ,Genome, Human/genetics ,Human genome ,Quantitative Trait Loci/genetics ,030217 neurology & neurosurgery - Abstract
Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project - the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. © 2013 Macmillan Publishers Limited. All rights reserved.
- Published
- 2013
26. graph2tab, a library to convert experimental workflow graphs into tabular formats
- Author
-
Marco Brandizi, Philippe Rocca-Serra, Natalja Kurbatova, and Ugis Sarkans
- Subjects
Statistics and Probability ,Source code ,Databases, Factual ,Computer science ,media_common.quotation_subject ,Databases and Ontologies ,computer.software_genre ,01 natural sciences ,Biochemistry ,Workflow technology ,Workflow ,Computer graphics ,03 medical and health sciences ,Documentation ,Computer Graphics ,0101 mathematics ,Molecular Biology ,030304 developmental biology ,media_common ,Oligonucleotide Array Sequence Analysis ,0303 health sciences ,Database ,010102 general mathematics ,Computational Biology ,Graph ,Computer Science Applications ,Computational Mathematics ,Applications Note ,Computational Theory and Mathematics ,Programming Languages ,computer - Abstract
Motivations: Spreadsheet-like tabular formats are ever more popular in the biomedical field as a means for experimental reporting. The problem of converting the graph of an experimental workflow into a table-based representation occurs in many such formats and is not easy to solve. Results: We describe graph2tab, a library that implements methods to realise such a conversion in a size-optimised way. Our solution is generic and can be adapted to specific cases of data exporters or data converters that need to be implemented. Availability and Implementation: The library source code and documentation are available at http://github.com/ISA-tools/graph2tab. Contact: brandizi@ebi.ac.uk. Supplementary Information: A supplementary document describes the theoretical and technical details about the library implementation.
- Published
- 2012
27. OntoCAT - an integrated programming toolkit for common ontology application tasks
- Author
-
Natalja Kurbatova, Helen Parkinson, Tomasz Adamusiak, and Morris A. Swertz
- Subjects
Service (systems architecture) ,Java ,Multimedia ,Bioinformatics ,Computer science ,business.industry ,Reuse ,Ontology (information science) ,computer.software_genre ,World Wide Web ,Bioconductor ,Software ,Resource (project management) ,General Materials Science ,Web service ,business ,computer ,computer.programming_language - Abstract
OntoCAT provides high level abstraction for interacting with ontology resources including local ontology files in standard OWL and OBO formats (via OWL API) and public ontology repositories: EBI Ontology Lookup Service (OLS) and NCBO BioPortal. Each resource is wrapped behind easy to learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies.
- Published
- 2011
28. OntoCAT – an integrated programming toolkit for common ontology application tasks
- Author
-
Tomasz Adamusiak, Natalja Kurbatova, Morris Swertz, and Helen Parkinson
- Subjects
General Materials Science
- Published
- 2011
29. OntoCAT - simple ontology search and integration in Java, R and REST/JavaScript
- Author
-
K. Joeri van der Velde, Despoina Antonakaki, Morris A. Swertz, Helen Parkinson, Niran Abeygunawardena, Tomasz Adamusiak, Tony Burdett, Misha Kapushesky, Natalja Kurbatova, and Life Course Epidemiology (LCE)
- Subjects
Ontology Inference Layer ,020205 medical informatics ,Databases, Factual ,Computer science ,Process ontology ,BIOLOGY ,02 engineering and technology ,Ontology (information science) ,lcsh:Computer applications to medicine. Medical informatics ,computer.software_genre ,Biochemistry ,Vocabulary ,World Wide Web ,Open Biomedical Ontologies ,03 medical and health sciences ,User-Computer Interface ,Structural Biology ,0202 electrical engineering, electronic engineering, information engineering ,Upper ontology ,Humans ,lcsh:QH301-705.5 ,Molecular Biology ,LOOKUP SERVICE ,030304 developmental biology ,GENE-EXPRESSION ,0303 health sciences ,Database ,Applied Mathematics ,Ontology-based data integration ,CONTROLLED VOCABULARY QUERIES ,BIOINFORMATICS ,Suggested Upper Merged Ontology ,Computational Biology ,PLATFORM ,ATLAS ,Computer Science Applications ,MODEL ,lcsh:Biology (General) ,Vocabulary, Controlled ,Ontology ,lcsh:R858-859.7 ,Programming Languages ,Ontology alignment ,computer ,Software - Abstract
Background Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately, map their contents to research data, and much of this effort is being replicated across different research groups. Results OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy to learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. Conclusions OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases. Availability http://www.ontocat.org
- Published
- 2011
30. A System for Information Management in BioMedical Studies--SIMBioMS
- Author
-
Mike Gostev, Mark I. McCarthy, Teemu Perheentupa, Johan Rung, Ugis Sarkans, Amy Barrett, Ilkka Lappalainen, Karlis Podnieks, Sudeshna Guha Neogi, Natalja Kurbatova, Andris Zarins, Juris Viksna, Alvis Brazma, Maria Krestyaninova, Peteris Rucevskis, and Juha Knuuttila
- Subjects
Statistics and Probability ,Information management ,Source code ,Databases, Factual ,Information Management ,Computer science ,media_common.quotation_subject ,Databases and Ontologies ,MEDLINE ,Information Storage and Retrieval ,Proteomics ,computer.software_genre ,Biochemistry ,World Wide Web ,03 medical and health sciences ,0302 clinical medicine ,Software ,Documentation ,Gene expression ,Information system ,Molecular Biology ,Genotyping ,030304 developmental biology ,media_common ,0303 health sciences ,business.industry ,Computational Biology ,Experimental data ,Computer Science Applications ,Applications Note ,Computational Mathematics ,Computational Theory and Mathematics ,Scripting language ,Database Management Systems ,business ,Functional genomics ,computer ,030217 neurology & neurosurgery - Abstract
Summary: SIMBioMS is a web-based open source software system for managing data and information in biomedical studies. It provides a solution for the collection, storage, management and retrieval of information about research subjects and biomedical samples, as well as experimental data obtained using a range of high-throughput technologies, including gene expression, genotyping, proteomics and metabonomics. The system can easily be customized and has proven to be successful in several large-scale multi-site collaborative projects. It is compatible with emerging functional genomics data standards and provides data import and export in accepted standard formats. Protocols for transferring data to durable archives at the European Bioinformatics Institute have been implemented. Availability: The source code, documentation and initialization scripts are available at http://simbioms.org. Contact: support@simbioms.org; mariak@ebi.ac.uk
- Published
- 2009
31. Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites
- Author
-
Natalja Kurbatova, Janet M. Thornton, and Rafael Najmanovich
- Subjects
Statistics and Probability ,Models, Molecular ,Protein Conformation ,Molecular Sequence Data ,Sequence Homology ,Plasma protein binding ,Computational biology ,Biology ,computer.software_genre ,Biochemistry ,Sequence Analysis, Protein ,Computer Simulation ,Uniqueness ,Amino Acid Sequence ,Binding site ,Molecular Biology ,Protein function ,Sequence ,Binding Sites ,Ligand ,Discriminant Analysis ,Proteins ,Small molecule ,Computer Science Applications ,Computational Mathematics ,Computational Theory and Mathematics ,Models, Chemical ,Data mining ,computer ,Sequence Alignment ,Function (biology) ,Algorithms ,Protein Binding - Abstract
Motivation: Current computational methods for the prediction of function from structure are restricted to the detection of similarities and subsequent transfer of functional annotation. In a significant minority of cases, global sequence or structural (fold) similarities do not provide clues about protein function. In these cases, one alternative is to detect local binding site similarities. These may still reflect more distant evolutionary relationships as well as unique physico-chemical constraints necessary for binding similar ligands, thus helping pinpoint the function. In the present work, we ask the following question: is it possible to discriminate within a dataset of non-homologous proteins those that bind similar ligands based on their binding site similarities? Methods: We implement a graph-matching-based method for the detection of 3D atomic similarities introducing some simplifications that allow us to extend its applicability to the analysis of large all-atom binding site models. This method, called IsoCleft, does not require atoms to be connected either in sequence or space. We apply the method to a cognate-ligand bound dataset of non-homologous proteins. We define a family of binding site models with decreasing knowledge about the identity of the ligand-interacting atoms to uncouple the questions of predicting the location of the binding site and detecting binding site similarities. Furthermore, we calculate the individual contributions of binding site size, chemical composition and geometry to prediction performance. Results: We find that it is possible to discriminate between different ligand-binding sites. In other words, there is a certain uniqueness in the set of atoms that are in contact with specific ligand scaffolds. This uniqueness is restricted to the atoms in close proximity of the ligand, in which case size and chemical composition alone are sufficient to discriminate binding sites.
Discrimination ability decreases with decreasing knowledge about the identity of the ligand-interacting binding site atoms. The decrease is quite abrupt when considering size and chemical composition alone, but much slower when including geometry. We also observe that certain ligands are easier to discriminate. Interestingly, the subset of binding site atoms belonging to highly conserved residues is not sufficient to discriminate binding sites, implying that convergently evolved binding sites arrived at dissimilar solutions. Availability: IsoCleft can be obtained from the authors. Contact: rafael.najmanovich@ebi.ac.uk
- Published
- 2008
32. IsoCleft Finder Dataset version 1.6 (ICFDB v. 1.6)
- Author
-
Natalja Kurbatova, Matthieu Chartier, María Inés Zylber, and Rafael Najmanovich
- Published
- 2013
- Full Text
- View/download PDF
33. Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites.
- Author
-
Rafael Najmanovich, Natalja Kurbatova, and Janet Thornton
- Subjects
BUSINESS relocation ,IMMUNOGLOBULIN idiotypes ,IMMUNOSPECIFICITY ,BIOMOLECULES - Abstract
Motivation: Current computational methods for the prediction of function from structure are restricted to the detection of similarities and subsequent transfer of functional annotation. In a significant minority of cases, global sequence or structural (fold) similarities do not provide clues about protein function. In these cases, one alternative is to detect local binding site similarities. These may still reflect more distant evolutionary relationships as well as unique physico-chemical constraints necessary for binding similar ligands, thus helping pinpoint the function. In the present work, we ask the following question: is it possible to discriminate within a dataset of non-homologous proteins those that bind similar ligands based on their binding site similarities? Methods: We implement a graph-matching-based method for the detection of 3D atomic similarities introducing some simplifications that allow us to extend its applicability to the analysis of large all-atom binding site models. This method, called IsoCleft, does not require atoms to be connected either in sequence or space. We apply the method to a cognate-ligand bound dataset of non-homologous proteins. We define a family of binding site models with decreasing knowledge about the identity of the ligand-interacting atoms to uncouple the questions of predicting the location of the binding site and detecting binding site similarities. Furthermore, we calculate the individual contributions of binding site size, chemical composition and geometry to prediction performance. Results: We find that it is possible to discriminate between different ligand-binding sites. In other words, there is a certain uniqueness in the set of atoms that are in contact with specific ligand scaffolds. This uniqueness is restricted to the atoms in close proximity of the ligand, in which case size and chemical composition alone are sufficient to discriminate binding sites.
Discrimination ability decreases with decreasing knowledge about the identity of the ligand-interacting binding site atoms. The decrease is quite abrupt when considering size and chemical composition alone, but much slower when including geometry. We also observe that certain ligands are easier to discriminate. Interestingly, the subset of binding site atoms belonging to highly conserved residues is not sufficient to discriminate binding sites, implying that convergently evolved binding sites arrived at dissimilar solutions. Availability: IsoCleft can be obtained from the authors. Contact: rafael.najmanovich@ebi.ac.uk [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF