Virtual Grid Engine: a simulated grid engine environment for large-scale supercomputers
- Source :
- BMC Bioinformatics, Vol 20, Iss S16, Pp 1-10 (2019)
- Publication Year :
- 2019
- Publisher :
- BMC, 2019.
Abstract
- Background: Supercomputers have become indispensable infrastructure in science and industry. In particular, most state-of-the-art scientific results rely on massively parallel supercomputers ranked in the TOP500. However, their use in bioinformatics remains limited because they do not provide the asynchronous parallel processing service of Grid Engine. To encourage the use of massively parallel supercomputers in bioinformatics, we developed middleware called Virtual Grid Engine (VGE), which enables software pipelines to run their tasks automatically as MPI programs. Results: We conducted basic tests to measure the time VGE requires to assign jobs to workers. The results showed that the overhead of the employed algorithm was 246 microseconds and that the software can manage thousands of jobs smoothly on the K computer. We also ran a practical bioinformatics test consisting of two tasks: splitting the input FASTQ data and aligning it with BWA. This calculation used 25,055 nodes (200,440 cores) and completed in three hours. Conclusion: We identified four important requirements for this kind of software: a non-privileged server program, multiple job handling, dependency control, and usability. We designed the software around these requirements and verified each of them. The software fulfilled all four and achieved good performance in a large-scale analysis.
- Subjects :
- Distributed computing
TOP500
02 engineering and technology
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
User-Computer Interface
03 medical and health sciences
Software
Computer Systems
Structural Biology
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Humans
Computer Simulation
Molecular Biology
Massively parallel
lcsh:QH301-705.5
030304 developmental biology
0303 health sciences
Applied Mathematics
Python (programming language)
Grid
Supercomputer
Grid engine
Computer Science Applications
lcsh:Biology (General)
Asynchronous communication
Middleware
lcsh:R858-859.7
MPI
High performance computing
Algorithms
Python
Details
- Language :
- English
- ISSN :
- 1471-2105
- Volume :
- 20
- Database :
- OpenAIRE
- Journal :
- BMC Bioinformatics
- Accession number :
- edsair.doi.dedup.....72b690bc7a42083b676ac46420bca010