1. The impact of quality filter for RNA-Seq.
- Author
- de Sá PH, Veras AA, Carneiro AR, Pinheiro KC, Pinto AC, Soares SC, Schneider MP, Azevedo V, Silva A, and Ramos RT
- Subjects
- Corynebacterium pseudotuberculosis genetics, Microcystis genetics, Reproducibility of Results, RNA genetics, Sequence Analysis, RNA methods
- Abstract
Background: Since large-scale sequencing platforms emerged in 2005, methods for decoding DNA sequences have changed dramatically, and this has also affected quantitative and qualitative gene expression analyses through the RNA-Sequencing (RNA-Seq) technique. However, the amount of data required for the analyses has been a concern because it affects the reliability of the experiments. Thus, RNA depletion during sample preparation may influence the results. Moreover, because data produced by these platforms vary in quality, quality filters are often used to remove sequences likely to contain errors and so increase the accuracy of the results. However, when quality filters remove reads, the expression profile in RNA-Seq experiments may be influenced.
Result: The present study aimed to analyze the impact of different quality filter values on RNA-Seq data from Corynebacterium pseudotuberculosis (sequenced on the SOLiD platform) and from Microcystis aeruginosa and Kineococcus radiotolerans (sequenced on the Illumina platform). Although up to 47.9% of the reads produced by the SOLiD technology were removed after the QV20 quality filter was applied, and 15.85% were removed from the K. radiotolerans data set using the QV30 filter, Illumina data showed the largest number of unique differentially expressed genes after applying the most stringent filter (QV30), with 69 genes. In contrast, for SOLiD, the acid stress condition with the QV20 filter yielded only 41 unique differentially expressed genes. Even for the highest-quality M. aeruginosa data, the quality filter affected the expression profile. The most stringent quality filter generated a greater number of unique differentially expressed genes: 9 for the high molecular weight dissolved organic matter condition and 12 for the low P condition.
Conclusion: Even high-accuracy sequencing technologies are subject to the influence of quality filters when evaluating RNA-Seq data using the reference approach.
(Copyright © 2015. Published by Elsevier B.V.)
- Published
- 2015