Explorer: efficient DNA coding by De Bruijn graph toward arbitrary local and global biochemical constraints.
- Source :
- Briefings in Bioinformatics; Sep2024, Vol. 25 Issue 5, p1-10, 10p
- Publication Year :
- 2024
Abstract
- With the exponential growth of digital data, there is a pressing need for innovative storage media and techniques. DNA molecules, due to their stability, storage capacity, and density, offer a promising solution for information storage. However, DNA storage also faces numerous challenges, such as complex biochemical constraints and encoding efficiency. This paper presents Explorer, a high-efficiency DNA coding algorithm based on the De Bruijn graph, which leverages its capability to characterize local sequences. Explorer enables coding under various biochemical constraints, such as homopolymers, GC content, and undesired motifs. This paper also introduces Codeformer, a fast decoding algorithm based on the transformer architecture, to further enhance decoding efficiency. Numerical experiments indicate that, compared with other advanced algorithms, Explorer not only achieves stable encoding and decoding under various biochemical constraints but also increases the encoding efficiency and bit rate by >10%. Additionally, Codeformer demonstrates the ability to efficiently decode large quantities of DNA sequences. Under different parameter settings, its decoding efficiency exceeds that of traditional algorithms by more than two-fold. When Codeformer is combined with Reed–Solomon code, its decoding accuracy exceeds 99%, making it a good choice for high-speed decoding applications. These advancements are expected to contribute to the development of DNA-based data storage systems and the broader exploration of DNA as a novel information storage medium. [ABSTRACT FROM AUTHOR]
- Subjects :
- DE Bruijn graph
DECODING algorithms
DATA warehousing
DNA sequencing
DNA
Details
- Language :
- English
- ISSN :
- 1467-5463
- Volume :
- 25
- Issue :
- 5
- Database :
- Complementary Index
- Journal :
- Briefings in Bioinformatics
- Publication Type :
- Academic Journal
- Accession number :
- 179874068
- Full Text :
- https://doi.org/10.1093/bib/bbae363
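
The abstract describes coding on a De Bruijn graph that captures local sequence constraints. The Python sketch below illustrates that general idea only; it is not the authors' Explorer algorithm. All names (`build_graph`, `encode`, `decode`), the choice of one bit per branching step, and the parameter values are assumptions made for illustration. Nodes are k-mers that satisfy the local constraints (homopolymer length, forbidden motifs), and an edge exists only when the overlapping (k+1)-mer is also valid, so any walk yields a strand that respects those constraints, provided k is at least the maximum homopolymer length and every motif has length at most k+1. GC content is a global property and is only checked after the fact here.

```python
from itertools import product

BASES = "ACGT"

def max_run(seq):
    """Length of the longest homopolymer run in seq."""
    best, run, prev = 0, 0, None
    for b in seq:
        run = run + 1 if b == prev else 1
        prev = b
        best = max(best, run)
    return best

def gc_content(seq):
    """Fraction of G/C bases (a global constraint, checked post hoc here)."""
    return sum(b in "GC" for b in seq) / len(seq)

def is_valid(seq, max_homopolymer, motifs):
    """Local constraints: bounded homopolymer runs, no forbidden motifs."""
    return max_run(seq) <= max_homopolymer and not any(m in seq for m in motifs)

def build_graph(k, max_homopolymer, motifs):
    """Nodes are valid k-mers; u -> v if v = u[1:] + b and u + b is valid."""
    nodes = {"".join(p) for p in product(BASES, repeat=k)}
    nodes = {n for n in nodes if is_valid(n, max_homopolymer, motifs)}
    graph = {
        u: [u[1:] + b for b in BASES
            if u[1:] + b in nodes and is_valid(u + b, max_homopolymer, motifs)]
        for u in nodes
    }
    # Iteratively prune dead ends so every remaining node has a successor.
    while True:
        dead = {u for u, vs in graph.items() if not vs}
        if not dead:
            return graph
        graph = {u: [v for v in vs if v not in dead]
                 for u, vs in graph.items() if u not in dead}

def encode(bits, graph, k):
    """Toy encoder: one bit per branching node picks between two successors.
    Assumes branching nodes keep being reached; k is kept for symmetry."""
    node = min(graph)                 # fixed, agreed-upon start node
    strand, i = node, 0
    while i < len(bits):
        succ = sorted(graph[node])
        if len(succ) >= 2:            # branching node: consume one bit
            node = succ[bits[i]]
            i += 1
        else:                         # forced move: carries no information
            node = succ[0]
        strand += node[-1]
    return strand

def decode(strand, graph, k):
    """Replay the walk and recover the bit chosen at each branching node."""
    node, bits = strand[:k], []
    for base in strand[k:]:
        succ = sorted(graph[node])
        nxt = node[1:] + base
        if len(succ) >= 2:
            bits.append(succ.index(nxt))
        node = nxt
    return bits
```

As a usage example under the same assumptions, `build_graph(3, 2, ["ACG"])` followed by `encode(bits, graph, 3)` produces a strand with no run longer than two bases and no `ACG` motif, and `decode(strand, graph, 3)` returns the original bit list. A practical scheme would use more of the available out-degree per step and would handle global constraints such as GC content during the walk rather than afterwards.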