A Tool for Creating and Parallelizing Bioinformatics Pipelines
- Source :
- DTIC
- Publication Year :
- 2007
Abstract
- Bioinformatics pipelines enable life scientists to effectively analyze biological data through automated multi-step processes constructed from individual programs and databases. The huge amount of data and the time-consuming computations require effectively parallelized pipelines to provide results within a reasonable time. To reduce researchers' programming burden for pipeline creation and parallelization, we developed the Bioinformatics Pipeline Generation and Parallelization Toolkit (BioGent). A user needs only to create a pipeline definition file that describes the data processing sequence and the input/output files. A program termed schedpipe in the BioGent toolkit takes the definition file and executes the designed procedure. Schedpipe automatically parallelizes the pipeline execution by performing independent data processing steps on multiple CPUs, and by decomposing big datasets into small chunks and processing them in parallel. Schedpipe controls program execution on multiple CPUs through a simple application programming interface (API) of the Parallel Job Manager (PJM) library. As a part of the BioGent toolkit, PJM was developed to effectively launch and monitor programs on multiple CPUs using a Message Passing Interface (MPI) protocol. The PJM API can also be used to parallelize other serial programs. A demonstration using PJM for parallelization shows 10% to 50% savings in time compared to an indigenous parallelization through a batch queuing system.
- The original document contains color images. All DTIC reproductions will be in black and white. Presented at the DoD High Performance Computing Modernization Program (HPCMP) Users Group Conference (USG) (2007): A Bridge to Future Defense, held in Pittsburgh, PA on 18-21 June 2007. Published in Proceedings of the DoD High Performance Computing Modernization Program (HPCMP) Users Group Conference (USG), pp. 417-420, June 2007. Publisher: IEEE Computer Society, Conference Publishing Services (CPS). ISBN 0-7695-3088-5 and ISBN 978-0-7695-3088-8. Sponsored in part by the Department of Defense. This article is from ADA488707 Proceedings of the HPCMP Users Group Conference 2007. High Performance Computing Modernization Program: A Bridge to Future Defense held 18-21 June 2007 in Pittsburgh, Pennsylvania.
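The chunk-level parallelism described in the abstract (decomposing a big dataset into small chunks and processing them in parallel) can be illustrated with a minimal Python sketch. This is not BioGent, schedpipe, or the PJM API, none of which are specified in this record; a thread pool from the Python standard library stands in for the MPI-based job manager, and the names `split_into_chunks`, `gc_content`, and `run_step_in_parallel` are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Decompose a dataset into consecutive chunks of at most chunk_size records."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def gc_content(chunk):
    """Hypothetical per-chunk step: GC fraction of each (non-empty) sequence."""
    return [(seq.count("G") + seq.count("C")) / len(seq) for seq in chunk]


def run_step_in_parallel(step, records, chunk_size, workers=4):
    """Apply step to each chunk concurrently and merge results in input order."""
    chunks = split_into_chunks(records, chunk_size)
    # Assumption: a local thread pool replaces the MPI-based launcher.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_chunk = list(pool.map(step, chunks))  # map preserves chunk order
    return [value for chunk_result in per_chunk for value in chunk_result]


sequences = ["GGCC", "ATAT", "GATC", "CCCC", "AAGT"]
results = run_step_in_parallel(gc_content, sequences, chunk_size=2)
print(results)
```

Because `pool.map` returns results in the order the chunks were submitted, the merged output lines up with the input records even when chunks finish out of order, which is the property a chunk-and-merge pipeline step typically needs.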
Details
- Database :
- OAIster
- Journal :
- DTIC
- Notes :
- text/html, English
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.ocn832033947
- Document Type :
- Electronic Resource