dBBQs: dataBase of Bacterial Quality scores
- Source :
- BMC Bioinformatics, Vol 18, Iss S14, Pp 147-153 (2017)
- Publication Year :
- 2017
- Publisher :
- Springer Science and Business Media LLC, 2017.
Abstract
- Background: It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete, but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Results: Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA genes, number of tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, and the results can be used for further analysis. In addition, the search results are shown in interactive charts built with the JavaScript framework DC.js. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. Conclusions: dBBQs (available at http://arc-gem.uams.edu/dbbqs) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving analyses. Moreover, all data from the four measurements that were combined to make the quality score for each genome is also provided and can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.
- Subjects :
- 0301 basic medicine
Computer science
media_common.quotation_subject
Bacterial genome size
Biology
JavaScript
lcsh:Computer applications to medicine. Medical informatics
computer.software_genre
Biochemistry
Genome
DNA sequencing
Database
Set (abstract data type)
User-Computer Interface
03 medical and health sciences
0302 clinical medicine
Structural Biology
Databases, Genetic
Genome quality score
Quality (business)
lcsh:QH301-705.5
Molecular Biology
Gene
computer.programming_language
media_common
Internet
Base Sequence
Bacteria
Applied Mathematics
Chromosome Mapping
Genomics
Computer Science Applications
ComputingMethodologies_PATTERNRECOGNITION
030104 developmental biology
lcsh:Biology (General)
Quality Score
lcsh:R858-859.7
DNA microarray
computer
Genome, Bacterial
030217 neurology & neurosurgery
Details
- ISSN :
- 14712105
- Volume :
- 18
- Database :
- OpenAIRE
- Journal :
- BMC Bioinformatics
- Accession number :
- edsair.doi.dedup.....7d42e8c4de665af28ff04cbc4b33790c