Overcoming High Nanopore Basecaller Error Rates for DNA Storage Via Basecaller-Decoder Integration and Convolutional Codes
- Source :
- ICASSP
- Publication Year :
- 2019
- Publisher :
- Cold Spring Harbor Laboratory, 2019.
Abstract
- As magnetization- and semiconductor-based storage technologies approach their limits, biomolecules such as DNA have been identified as promising media for future storage systems because of their high storage density (petabytes/gram) and long-term durability (thousands of years). Furthermore, nanopore DNA sequencing enables high-throughput sequencing using devices as small as a USB thumb drive and is thus well suited to DNA storage applications. Because basecalled nanopore reads have high insertion/deletion error rates, current approaches rely heavily on consensus among multiple reads and thus incur very high reading costs. We propose a novel approach that overcomes the high error rates in basecalled sequences by integrating a Viterbi error correction decoder with the basecaller, enabling the decoder to exploit the soft information available in the deep-learning-based basecaller pipeline. Using convolutional codes for error correction, we experimentally observed 3x lower reading costs than state-of-the-art techniques at comparable writing costs. The code, data, and Supplementary Material are available at https://github.com/shubhamchandak94/nanopore_dna_storage.
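To illustrate the general idea of a Viterbi decoder that uses soft information, here is a minimal sketch, not the authors' implementation. It assumes a textbook rate-1/2 convolutional code with generators (7, 5) in octal and assumes the decoder is handed one probability per coded bit. The approach in the abstract instead couples the decoder to the basecaller's output, which must also cope with insertions and deletions; this sketch does not model that. All names here (`G`, `K`, `step`, `encode`, `viterbi_soft`) are hypothetical.

```python
import math

# Assumed toy code: rate 1/2, constraint length 3, generators (7, 5) octal.
G = (0b111, 0b101)
K = 3
N_STATES = 1 << (K - 1)


def step(state, bit):
    """Return (next_state, output_bits) for one input bit."""
    reg = (bit << (K - 1)) | state  # newest bit sits in the top position
    out = tuple(bin(reg & g).count("1") & 1 for g in G)
    return reg >> 1, out


def encode(bits):
    """Encode bits, appending K-1 zeros so the trellis ends in state 0."""
    state, coded = 0, []
    for b in list(bits) + [0] * (K - 1):
        state, out = step(state, b)
        coded.extend(out)
    return coded


def viterbi_soft(p_one):
    """Decode from p_one[i], the estimated probability that coded bit i is 1."""
    eps = 1e-12  # floor to keep log() finite
    n_out = len(G)
    cost = [0.0] + [math.inf] * (N_STATES - 1)  # encoder starts in state 0
    history = []
    for t in range(len(p_one) // n_out):
        probs = p_one[t * n_out:(t + 1) * n_out]
        new_cost = [math.inf] * N_STATES
        back = [None] * N_STATES
        for s in range(N_STATES):
            if cost[s] == math.inf:
                continue  # state not reachable yet
            for b in (0, 1):
                ns, out = step(s, b)
                # Branch metric: negative log-likelihood of the emitted bits.
                m = cost[s] - sum(
                    math.log(max(p if o else 1.0 - p, eps))
                    for o, p in zip(out, probs)
                )
                if m < new_cost[ns]:
                    new_cost[ns], back[ns] = m, (s, b)
        cost = new_cost
        history.append(back)
    # Trace back from state 0, where the terminated trellis must end.
    state, decoded = 0, []
    for back in reversed(history):
        state, b = back[state]
        decoded.append(b)
    decoded.reverse()
    return decoded[:-(K - 1)]  # drop the termination bits
```

Because the branch metric is a log-likelihood rather than a Hamming distance, low-confidence symbols weigh less in the path decision than high-confidence ones. That weighting is the benefit of passing soft information to the decoder instead of hard basecalls.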
- Subjects :
- Computer science
Pipeline (computing)
02 engineering and technology
Viterbi algorithm
03 medical and health sciences
0302 clinical medicine
0202 electrical engineering, electronic engineering, information engineering
Code (cryptography)
030304 developmental biology
0303 health sciences
Reading (computer)
020206 networking & telecommunications
DNA storage
Nanopore
Convolutional code
Nanopore sequencing
Error detection and correction
030217 neurology & neurosurgery
Computer hardware
DNA
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- ICASSP
- Accession number :
- edsair.doi.dedup.....ea46816b9656d70f2d087ebdae280769
- Full Text :
- https://doi.org/10.1101/2019.12.20.871939