1. Ten steps to get started in Genome Assembly and Annotation
- Author
- Lieven Sterck, Brane Leskošek, Lucile Soler, Cedric Notredame, Erik Hjerde, Stéphanie Bocs, Mahesh Binzer-Panchal, Laurent Bouri, Henrik Lantz, Joelle Amselem, Christophe Klopp, Jean-François Gibrat, Olga Vinnere Pettersson, Salvador Capella-Gutierrez, Anna Vlasova, Victoria Dominguez Del Angel, Institut Français de Bioinformatique, Université Paris-Saclay, Department of Chemistry, RIDER UNIVERSITY, Department of Plant Biotechnology and Bioinformatics, Ghent University [Belgium] (UGENT), Department of Plant Systems Biology, VIB, Spanish National Bioinformatics Institute (INB), Centro Nacional de Supercomputación, BSC Electronics, Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra [Barcelona], Uppsala Genome Center, NGI/SciLifeLab - Department of Immunology, Genetics and Pathology, Uppsala University, Unité de Recherche Génomique Info (URGI), Institut National de la Recherche Agronomique (INRA), Université de Tunis [Tunis], Laboratory of Technologies of Information and Communication and Electrical Engineering, National Superior School of Engineers of Tunis, Amélioration génétique et adaptation des plantes méditerranéennes et tropicales (UMR AGAP), Institut national d’études supérieures agronomiques de Montpellier (Montpellier SupAgro)-Institut National de la Recherche Agronomique (INRA)-Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Centre international d'études supérieures en sciences agronomiques (Montpellier SupAgro), South Green Bioinformatics Platform, Unité de Mathématiques et Informatique Appliquées de Toulouse (MIAT INRA), Mathématiques et Informatique Appliquées du Génome à l'Environnement [Jouy-En-Josas] (MaIAGE), Faculty of Medicine, Institute for Biostatistics and Medical Informatics, University of Ljubljana, IMBIM/NBIS/SciLifeLab, Institut Français de Bioinformatique - UMS CNRS 3601 (IFB-CORE), Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM), Universiteit Gent = Ghent University [Belgium] (UGENT), Center for Plant Systems Biology (PSB Center), Vlaams Instituut voor Biotechnologie [Ghent, Belgique] (VIB), Barcelona Supercomputing Center - Centro Nacional de Supercomputacion (BSC - CNS), Universitat Pompeu Fabra [Barcelona] (UPF), Université de Tunis, Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Institut National de la Recherche Agronomique (INRA)-Centre international d'études supérieures en sciences agronomiques (Montpellier SupAgro)-Institut national d’études supérieures agronomiques de Montpellier (Montpellier SupAgro), Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro), Barcelona Supercomputing Center, Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), Universiteit Gent = Ghent University (UGENT), South Green Bioinformatics Platform [Montpellier]
- Subjects
0301 basic medicine; Computer science; General assembly; Data management; [SDV]Life Sciences [q-bio]; Annotation; Interoperability; Assembly; Sequence assembly; Bioinformatik och systembiologi; Genome; General Biochemistry, Genetics and Molecular Biology; F30 - Génétique et amélioration des plantes; Workflows; World Wide Web; 03 medical and health sciences; DNA--Analysis; Genomes; General Pharmacology, Toxicology and Pharmaceutics; ADN--Anàlisi; FAIR; Genome assembly; Bioinformatics and Systems Biology; General Immunology and Microbiology; business.industry; Biology and Life Sciences; General Medicine; Genome project; Articles; DNA; Opinion Article; L10 - Génétique et amélioration des animaux; ELIXIR-EXCELERATE; 030104 developmental biology; Workflow; NGS; business; Ciències de la salut [Àrees temàtiques de la UPC]; Genome annotation
- Abstract
As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR). ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures Programme of Horizon 2020 [676559].
- Published
- 2018