1. A gzip-based algorithm to identify bacterial families by 16S rRNA.
- Author
- Santoni, D. and Romano-Spica, V.
- Subjects
- *MICROORGANISMS, *RNA, *NUCLEOTIDE sequence, *MICROBIOLOGY, *ALGORITHMS
- Abstract
- Aims: Microbial family identification of 16S rDNA sequences by applying a strategy based on algorithms for data compression. Methods and Results: Perl scripts were developed to analyse similarities in microbial sequences, based on a gzip data compression technique. For each bacterial family (n = 196) a 16S rRNA reference file was constructed, against which new queries are compared by compression performance. An online, user-friendly bioinformatics tool was built to attribute a bacterial family to a 16S rRNA sequence. It was successfully applied to recognize different bacterial families, including Legionellaceae, Bacillaceae, Enterobacteriaceae, Acetobacteriaceae and Rhizobiaceae. The percentage of positive identifications is higher than 95% for fragments over 450 bp. Conclusions: A new bioinformatics approach has been developed to assign a taxonomic classification to a 16S rDNA sequence. An online tool provides quick and easy sequence attribution. The general principle can be applied to other genes of taxonomic interest. Significance and Impact of the Study: Availability of simple bioinformatics tools can support the development of molecular-based analysis and classification of bacteria, especially for environmental or uncultured strains. [ABSTRACT FROM AUTHOR]
- Published
- 2006
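
The compression-based comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so the one used here is an assumption: a query is assigned to the family whose reference text grows least, in gzip-compressed bytes, when the query is appended to it. The authors worked in Perl; Python is used here only for illustration, and the function names `compressed_size`, `extra_bytes`, and `assign_family` are hypothetical.

```python
import gzip
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of the gzip-compressed data."""
    return len(gzip.compress(data, compresslevel=9))


def extra_bytes(reference: str, query: str) -> int:
    """Compressed bytes added when the query is appended to a reference.

    Assumed scoring rule (not taken from the paper): a query that shares
    long substrings with the reference is encoded mostly as back-references,
    so it adds few bytes.
    """
    ref = reference.encode("ascii")
    return compressed_size(ref + query.encode("ascii")) - compressed_size(ref)


def assign_family(references: Mapping[str, str], query: str) -> str:
    """Pick the family whose reference text best 'explains' the query."""
    return min(references, key=lambda family: extra_bytes(references[family], query))
```

One caveat on this sketch: gzip's DEFLATE back-references reach only about 32 KB, so a reference file much larger than that would have only its final portion compared against the appended query. How the original tool deals with reference size is not stated in the abstract.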