1. Direct Superbubble Detection
- Author
-
Fabian Gärtner and Peter F. Stadler
- Subjects
superbubble ,depth-first search ,cycles ,linear time algorithm ,Industrial engineering. Management engineering ,T55.4-60.8 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Superbubbles are a class of induced subgraphs in digraphs that play an essential role in assembly algorithms for high-throughput sequencing data. They are connected with the remainder of the host digraph by a single entrance and a single exit vertex. Linear-time algorithms for the enumeration superbubbles recently have become available. Current approaches require the decomposition of the input digraph into strongly-connected components, which are then analyzed separately. In principle, a single depth-first search could be used, provided one can guarantee that the root of the depth-first search (DFS)-tree is not itself located in the interior or the exit point of a superbubble. Here, we describe a linear-time algorithm to determine suitable roots for a DFS-forest that is guaranteed to identify the superbubbles in a digraph correctly. In addition to the advantages of a more straightforward implementation, we observe a nearly three-fold gain in performance on real-world datasets. We present a reference implementation of the new algorithm that accepts many commonly-used input formats for digraphs. It is available as open source from github.
- Published
- 2019
- Full Text
- View/download PDF