Explorer: efficient DNA coding by De Bruijn graph toward arbitrary local and global biochemical constraints.

Authors :
Dou, Chang
Yang, Yijie
Zhu, Fei
Li, BingZhi
Duan, Yuping
Source :
Briefings in Bioinformatics; Sep2024, Vol. 25 Issue 5, p1-10, 10p
Publication Year :
2024

Abstract

With the exponential growth of digital data, there is a pressing need for innovative storage media and techniques. DNA molecules, due to their stability, storage capacity, and density, offer a promising solution for information storage. However, DNA storage also faces numerous challenges, such as complex biochemical constraints and encoding efficiency. This paper presents Explorer, a high-efficiency DNA coding algorithm based on the De Bruijn graph, which leverages its capability to characterize local sequences. Explorer enables coding under various biochemical constraints, such as homopolymers, GC content, and undesired motifs. This paper also introduces Codeformer, a fast decoding algorithm based on the transformer architecture, to further enhance decoding efficiency. Numerical experiments indicate that, compared with other advanced algorithms, Explorer not only achieves stable encoding and decoding under various biochemical constraints but also increases the encoding efficiency and bit rate by >10%. Additionally, Codeformer demonstrates the ability to efficiently decode large quantities of DNA sequences. Under different parameter settings, its decoding efficiency exceeds that of traditional algorithms by more than two-fold. When Codeformer is combined with Reed–Solomon code, its decoding accuracy exceeds 99%, making it a good choice for high-speed decoding applications. These advancements are expected to contribute to the development of DNA-based data storage systems and the broader exploration of DNA as a novel information storage medium. [ABSTRACT FROM AUTHOR]
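The abstract's idea of using a De Bruijn graph to characterize local sequence constraints can be illustrated with a minimal sketch. This is not the authors' Explorer algorithm; it only shows, under assumed parameters, how k-mers that violate a homopolymer limit or contain an undesired motif can be dropped from the graph so that any walk on the remaining nodes satisfies those local constraints. The function names (`build_graph`, `is_valid_walk`, `gc_content`), the k-mer length, and the example motif are all hypothetical choices made for illustration. GC content is a global property, so it is checked separately on the whole sequence.

```python
from itertools import product

BASES = "ACGT"

def max_run(seq):
    """Length of the longest homopolymer run in seq."""
    best = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best

def gc_content(seq):
    """Fraction of G and C bases in seq (global constraint)."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0

def allowed(kmer, max_homopolymer, motifs):
    """A k-mer is allowed if it respects the run limit and contains no motif."""
    return max_run(kmer) <= max_homopolymer and not any(m in kmer for m in motifs)

def build_graph(k, max_homopolymer, motifs):
    """Constrained De Bruijn graph (illustrative assumption, not the paper's method).

    Nodes are the allowed k-mers; an edge links a k-mer to each allowed
    k-mer obtained by shifting one base to the left and appending a base.
    The local constraints are only guaranteed when max_homopolymer < k
    and every motif has length <= k.
    """
    nodes = {"".join(p) for p in product(BASES, repeat=k)}
    nodes = {n for n in nodes if allowed(n, max_homopolymer, motifs)}
    return {n: [n[1:] + b for b in BASES if n[1:] + b in nodes] for n in nodes}

def is_valid_walk(seq, graph, k):
    """True if every length-k window of seq is a node of the graph."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return bool(kmers) and all(kmer in graph for kmer in kmers)

if __name__ == "__main__":
    graph = build_graph(k=3, max_homopolymer=2, motifs=("ACG",))
    print(len(graph))                         # number of allowed 3-mers
    print(graph["AAC"])                       # successors of AAC
    print(is_valid_walk("AACAGT", graph, 3))  # all windows allowed
    print(gc_content("AACAGT"))               # global GC check
```

An encoder in this style would map input bits to choices among the outgoing edges of the current node; how Explorer actually performs that mapping and handles the global GC constraint is described in the full paper, not in this abstract.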

Details

Language :
English
ISSN :
14675463
Volume :
25
Issue :
5
Database :
Complementary Index
Journal :
Briefings in Bioinformatics
Publication Type :
Academic Journal
Accession number :
179874068
Full Text :
https://doi.org/10.1093/bib/bbae363