Back to Search
Start Over
Correcting a Single Indel/Edit for DNA-Based Data Storage: Linear-Time Encoders and Order-Optimality
- Source :
- IEEE Transactions on Information Theory. 67:3438-3451
- Publication Year :
- 2021
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2021.
-
Abstract
- An indel refers to a single insertion or deletion, while an edit refers to a single insertion, deletion or substitution. In this article, we investigate codes that correct either a single indel or a single edit and provide linear-time algorithms that encode binary messages into these codes of length n. Over the quaternary alphabet, we provide two linear-time encoders. One corrects a single edit with $\lceil {\log \text {n}}\rceil+\text {O}(\log \log \text {n})$ redundancy bits, while the other corrects a single indel with $\lceil {\log \text {n}}\rceil+2$ redundant bits. These two encoders are order-optimal . The former encoder is the first known order-optimal encoder that corrects a single edit, while the latter encoder (that corrects a single indel) reduces the redundancy of the best known encoder of Tenengolts (1984) by at least four bits. Over the DNA alphabet, we impose an additional constraint: the $\mathtt {GC}$ -balanced constraint and require that exactly half of the symbols of any DNA codeword to be either $\mathtt {C}$ or $\mathtt {G}$ . In particular, via a modification of Knuthâs balancing technique, we provide a linear-time map that translates binary messages into $\mathtt {GC}$ -balanced codewords and the resulting codebook is able to correct a single indel or a single edit. These are the first known constructions of $\mathtt {GC}$ -balanced codes that correct a single indel or a single edit.
- Subjects :
- Code word
Order (ring theory)
Binary number
020206 networking & telecommunications
Data_CODINGANDINFORMATIONTHEORY
02 engineering and technology
Library and Information Sciences
Computer Science Applications
Combinatorics
Redundancy (information theory)
0202 electrical engineering, electronic engineering, information engineering
Indel
Time complexity
Encoder
Decoding methods
Information Systems
Mathematics
Subjects
Details
- ISSN :
- 15579654 and 00189448
- Volume :
- 67
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Information Theory
- Accession number :
- edsair.doi...........81f62ca2f2a8a0e71a99bd0baf8efaff