1. NASA GeneLab RNA-seq consensus pipeline: standardized processing of short-read RNA-seq data
- Author
- Komal S. Rathi, Egle Cekanaviciute, Colin P.S. Kruse, Sara Brin Rosenthal, Eliah G. Overbey, Shayoni Ray, Robert Meller, Daniel C. Berrios, Ted Liefeld, Raúl Herranz, Gary Hardiman, Sarah E. Wyatt, Richard Barker, Kathleen M. Fisch, Norman G. Lewis, Matthew Geniza, Sylvain V. Costes, Amanda M. Saravia-Butler, Michael J. Strong, Laurence B. Davin, Simon Gilroy, Tejaswini Mishra, Chris Wolverton, Joshua P. Vandenbrink, Zhe Zhang, Michael D. Lee, Silvio Weging, Alicia Villacampa, Joseph J. Bass, Homer Fogle, Sigrid Reinsch, Elizabeth A. Blaber, Luis Zea, Rachel Gilbert, Jonathan M. Galazka, Willian A. da Silveira, J. Tyson McDonald, Samrawit G. Gebre, Yared H. Kidane, Nathaniel J. Szewczyk, Imara Y. Perera, Deanne Taylor, Helio A. Costa, Afshin Beheshti, Candice Tahimic
- National Aeronautics and Space Administration (US), Biotechnology and Biological Sciences Research Council (UK), Centre for Musculoskeletal Ageing Research (UK), Agencia Estatal de Investigación (España), Nottingham Biomedical Research Centre (UK)
- Overbey, Eliah G. [0000-0002-2866-8294], Fogle, Homer [0000-0002-5579-5432], Beheshti, Afshin [0000-0003-4643-531X], Berrios, Daniel C. [0000-0003-4312-9552], Cekanaviciute, Egle [0000-0003-3306-1806], Davin, Laurence B. [0000-0002-3248-6485], Gebre, Samrawit [0000-0002-8963-4856], Geniza, Matthew [0000-0003-4828-7891], Gilroy, Simon [0000-0001-9597-6839], Hardiman, Gary [0000-0003-4558-0400], Herranz, Raúl [0000-0002-0246-9449], Kruse, Colin P. S. [0000-0001-7070-8889], Mishra, Tejaswini [0000-0001-9931-1260], Perera, Imara Y. [0000-0001-9421-1420], Ray, Shayoni [0000-0003-1911-7738], Reinsch, Sigrid [0000-0002-6484-7521], Rosenthal, Sara Brin [0000-0002-6548-9658], Strong, Michael [0000-0002-3247-6260], Szewczyk, Nathaniel [0000-0003-4425-9746], Tahimic, Candice G. T. [0000-0001-5862-2652], Taylor, Deanne M. [0000-0002-3302-4610], Villacampa, Alicia [0000-0002-7398-8545], Weging, Silvio [0000-0002-8484-4352], Wolverton, Chris [0000-0003-2248-474X], Wyatt, Sarah E. [0000-0001-7874-0509], Costes, Sylvain V. [0000-0002-8542-2389], Galazka, Jonathan M. [0000-0002-4153-0249]
- Subjects
- 0301 basic medicine, Data processing, Multidisciplinary, Computer science, Science, Pipeline (computing), Analysis working, Omics, RNA-Seq, 02 engineering and technology, 021001 nanoscience & nanotechnology, Short read, computer.software_genre, Article, Transcriptome, 03 medical and health sciences, 030104 developmental biology, Differentially expressed genes, Gene expression, Data mining, 0210 nano-technology, Space Sciences, Gene, computer
- Abstract
- Summary: With the development of transcriptomic technologies, we are able to quantify precise changes in gene expression profiles from astronauts and other organisms exposed to spaceflight. Members of NASA GeneLab and GeneLab-associated analysis working groups (AWGs) have developed a consensus pipeline for analyzing short-read RNA-sequencing data from spaceflight-associated experiments. The pipeline includes quality control, read trimming, mapping, and gene quantification steps, culminating in the detection of differentially expressed genes. This data analysis pipeline and the results of its execution using data submitted to GeneLab are now all publicly available through the GeneLab database. We present here the full details and rationale for the construction of this pipeline in order to promote transparency, reproducibility, and reusability of pipeline data; to provide a template for data processing of future spaceflight-relevant datasets; and to encourage cross-analysis of data from other databases with the data available in GeneLab.
- Highlights:
  • Analysis of omics data from different spaceflight studies presents unique challenges
  • A standardized pipeline for RNA-seq analysis eliminates data processing variation
  • The GeneLab RNA-seq pipeline includes QC, trimming, mapping, quantification, and DGE
  • Space-relevant data processed with this pipeline are available at genelab.nasa.gov
- Omics; Space Sciences
- Published
- 2021