PANGEA: pipeline for analysis of next generation amplicons
- Source :
- The ISME Journal. 4:852-861
- Publication Year :
- 2010
- Publisher :
- Springer Science and Business Media LLC, 2010.
- Abstract :
- High-throughput DNA sequencing can identify organisms and describe population structures in many environmental and clinical samples. Current technologies generate millions of reads in a single run, requiring extensive computational strategies to organize, analyze and interpret those sequences. A series of bioinformatics tools for high-throughput sequencing analysis, including pre-processing, clustering, database matching and classification, have been compiled into a pipeline called PANGEA. The PANGEA pipeline was written in Perl and can be run on Mac OSX, Windows or Linux. With PANGEA, sequences obtained directly from the sequencer can be processed quickly to provide the files needed for sequence identification by BLAST and for comparison of microbial communities. Two different sets of bacterial 16S rRNA sequences were used to show the efficiency of this workflow. The first set of 16S rRNA sequences is derived from various soils from Hawaii Volcanoes National Park. The second set is derived from stool samples collected from diabetes-resistant and diabetes-prone rats. The workflow described here allows the investigator to quickly assess libraries of sequences on personal computers with customized databases. PANGEA is provided for users as individual scripts for each step in the process or as a single script where all processes, except the χ² step, are joined into one program called the 'backbone'.
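The "individual scripts or single backbone" design described in the abstract can be illustrated with a minimal sketch. This is not PANGEA's code (PANGEA is written in Perl); every function name, the length threshold, the toy reference table and the sample reads below are hypothetical, and the clustering and database-matching stages are deliberately simplified stand-ins (exact-duplicate grouping and exact lookup instead of real clustering and BLAST).

```python
from collections import Counter

# Hypothetical stand-ins for the four stages named in the abstract.
# None of this is PANGEA's Perl code; names and thresholds are assumptions.

def preprocess(reads, min_len=5):
    """Upper-case reads; drop those shorter than min_len or containing N."""
    return [r.upper() for r in reads if len(r) >= min_len and "N" not in r.upper()]

def cluster(reads):
    """Toy clustering: group identical reads and count them."""
    return Counter(reads)

def match(clusters, reference):
    """Toy database matching: exact lookup instead of a BLAST search."""
    return {seq: reference.get(seq, "unclassified") for seq in clusters}

def classify(clusters, hits):
    """Sum read counts per assigned taxon."""
    table = Counter()
    for seq, n in clusters.items():
        table[hits[seq]] += n
    return table

def backbone(reads, reference):
    """Run all stages in order behind a single entry point."""
    clean = preprocess(reads)
    clusters = cluster(clean)
    hits = match(clusters, reference)
    return classify(clusters, hits)

# Made-up example data, for illustration only.
reference = {"ACGTACGT": "Taxon A", "TTGGCCAA": "Taxon B"}
reads = ["acgtacgt", "ACGTACGT", "TTGGCCAA", "ACGN", "GGGGGGGG", "ACG"]
counts = backbone(reads, reference)
print(dict(counts))
```

Each stage remains callable on its own, mirroring the per-step scripts, while `backbone` chains them. A statistical comparison between communities would sit outside `backbone`, in line with the abstract's note that the χ² step is run separately.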
- Subjects :
- DNA, Bacterial
Molecular Sequence Data
Population
Biology
Microbiology
Article
Workflow
Set (abstract data type)
Feces
RNA, Ribosomal, 16S
Diabetes Mellitus
Animals
Cluster analysis
Ecosystem
Soil Microbiology
Ecology, Evolution, Behavior and Systematics
Genetics
Bacteria
Database
Computational Biology
Sequence Analysis, DNA
Pipeline (software)
Rats
Identification (information)
Scripting language
Perl
Software
Details
- ISSN :
- 1751-7370 and 1751-7362
- Volume :
- 4
- Database :
- OpenAIRE
- Journal :
- The ISME Journal
- Accession number :
- edsair.doi.dedup.....3219f18769235dc906140b0f2ead6f71
- Full Text :
- https://doi.org/10.1038/ismej.2010.16