SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
- Source :
- Genomics, Proteomics & Bioinformatics, Vol 17, Iss 2, Pp 211-218 (2019)
- Publication Year :
- 2019
- Publisher :
- Elsevier, 2019.
Abstract
- As next-generation sequencing (NGS) technology has become widely used to identify genetic causal variants for various diseases and traits, a number of packages for checking NGS data quality have appeared in the public domain. In addition to the quality of sequencing data, sample quality issues, such as gender mismatch, abnormal inbreeding coefficient, cryptic relatedness, and population outliers, can also have a fundamental impact on downstream analysis. However, there is a lack of tools specialized in identifying problematic samples from NGS data, often because of limitations in sample size and variant counts. We developed SeqSQC, a Bioconductor package, to automate and accelerate sample cleaning in NGS data of any scale. SeqSQC is designed for efficient data storage and access, and is equipped with interactive plots for intuitive data visualization to expedite the identification of problematic samples. SeqSQC is available at http://bioconductor.org/packages/SeqSQC.
- Keywords: Next-generation sequencing, Quality assessment, 1000 Genomes Project, Whole-exome sequencing, Bioconductor package
- Subjects :
- Computer science
Population
Method
Breast Neoplasms
Sample (statistics)
computer.software_genre
Biochemistry
Cohort Studies
Bioconductor
03 medical and health sciences
0302 clinical medicine
Data visualization
Exome Sequencing
Genetics
Humans
education
Molecular Biology
lcsh:QH301-705.5
030304 developmental biology
0303 health sciences
education.field_of_study
Genome, Human
business.industry
Racial Groups
High-Throughput Nucleotide Sequencing
Computational Mathematics
Identification (information)
lcsh:Biology (General)
Sample size determination
Data quality
Outlier
Female
Data mining
business
computer
Software
030217 neurology & neurosurgery
Details
- Language :
- English
- ISSN :
- 16720229
- Volume :
- 17
- Issue :
- 2
- Database :
- OpenAIRE
- Journal :
- Genomics, Proteomics & Bioinformatics
- Accession number :
- edsair.doi.dedup.....68062c6221a5cc16025a96f800a2e168