1. POSA: Perl Objects for DNA Sequencing Data Analysis
- Author
Martien A. M. Groenen, Jan Aerts, and B.J. Jungerius
- Subjects
lcsh:QH426-470, Test data generation, Automated data processing, lcsh:Biotechnology, Biology, Animal Breeding and Genomics, system, DNA sequencing, Contig Mapping, Software, lcsh:TP248.13-248.65, Genetics, Fokkerij en Genomica, computer.programming_language, business.industry, DNA sequencing theory, Sequence Analysis, DNA, lcsh:Genetics, Informatics, WIAS, Perl, Software engineering, business, computer, Biotechnology, Personal genomics
- Abstract
Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide modules that need advanced informatics skills before they can be implemented in pipelines.
Results: Here we present POSA, a pair of new Perl objects that describe DNA sequence traces and Phrap contig assemblies in detail. POSA methods include base calling with quality scores (by Phred), contig assembly (by Phrap), generation of Primer3 input, and automated SNP annotation (by PolyPhred). Although easily implemented by users with only limited programming experience, these objects considerably reduce hands-on analysis time compared to using the Staden package for extracting sequence information from raw sequencing files and for SNP discovery.
Conclusions: The POSA objects allow flexible and easy design, implementation, and use of Perl-based pipelines to handle and analyze DNA sequencing data, while requiring only minor programming skills.
- Published
- 2004