Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning
@article{BarLev2021DeepDS, title={Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning}, author={Dan Bar-Lev and Itai Orr and Omer Sabary and Tuvi Etzion and Eitan Yaakobi}, journal={ArXiv}, year={2021}, volume={abs/2109.00031} }
The concept of DNA storage was first suggested in 1959 by Richard Feynman, who shared his vision of nanotechnology in the talk “There is plenty of room at the bottom”. Later, towards the end of the 20th century, interest in storage solutions based on DNA molecules grew as a result of the Human Genome Project, which in turn led to significant progress in sequencing and assembly methods. DNA storage enjoys major advantages over the well-established magnetic and optical…
3 Citations
Single-Read Reconstruction for DNA Data Storage Using Transformers
- Computer Science
- ArXiv
- 2021
This work proposes a novel approach to single-read reconstruction for DNA-based data storage using an encoder-decoder Transformer architecture, and achieves lower error rates when reconstructing the original data from a single read of each DNA strand than state-of-the-art algorithms that use 2-3 copies.
On The Decoding Error Weight of One or Two Deletion Channels
- Computer Science
- ArXiv
- 2022
This paper studies optimal decoding for a special case of the deletion channel, referred to as the k-deletion channel, which deletes exactly k symbols of the transmitted word uniformly at random. The goal is to understand how an optimal decoder operates in order to minimize the expected normalized distance.
The Input and Output Entropies of the k-Deletion/Insertion Channel
- Computer Science
- ArXiv
- 2022
For both the 1-insertion and 1-deletion channels, it is proved that among all words with a fixed number of runs, the input entropy is minimized for words with a skewed distribution of their run lengths and maximized for words with a balanced distribution of their run lengths.
References
Single-Read Reconstruction for DNA Data Storage Using Transformers
- Computer Science
- ArXiv
- 2021
This work proposes a novel approach to single-read reconstruction for DNA-based data storage using an encoder-decoder Transformer architecture, and achieves lower error rates when reconstructing the original data from a single read of each DNA strand than state-of-the-art algorithms that use 2-3 copies.
Portable and Error-Free DNA-Based Data Storage
- Computer Science
- Scientific Reports
- 2017
This work represents the only known random access DNA-based data storage system that uses error-prone nanopore sequencers, while still producing error-free readouts with the highest reported information rate/density.
Scaling up DNA data storage and random access retrieval
- Computer Science
- bioRxiv
- 2017
A novel coding scheme is developed that dramatically reduces the physical redundancy (sequencing read coverage) required for error-free decoding to a median of 5x, while maintaining levels of logical redundancy comparable to the best prior codes.
A Characterization of the DNA Data Storage Channel
- Computer Science
- Scientific Reports
- 2019
It is found that errors within molecules are mainly due to synthesis and sequencing, while imperfections in handling and storage lead to a significant loss of sequences.
Data storage in DNA with fewer synthesis cycles using composite DNA letters
- Computer Science
- Nature Biotechnology
- 2019
The development of encoding and decoding methods that exploit information redundancy using composite DNA letters is reported. A composite DNA letter is a representation of a position in a sequence that consists of a mixture of all four DNA nucleotides in a predetermined ratio.
Towards practical, high-capacity, low-maintenance information storage in synthesized DNA
- Computer Science
- Nature
- 2013
Theoretical analysis indicates that the DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving.
Towards Practical and Robust DNA-based Data Archiving by Codec System Named ‘Yin-Yang’
- Computer Science
- 2019
This paper proposes a robust DNA-based data storage method based on a new codec algorithm, named ‘Yin-Yang’, which shows great potential for achieving high storage capacity per nucleotide (230 PB/gram) and high fidelity of data recovery.
DNA Fountain enables a robust and efficient storage architecture
- Computer Science, Biology
- Science
- 2017
A storage strategy is reported that is highly robust and approaches the information capacity per nucleotide, with perfect retrieval from a density of 215 petabytes per gram of DNA, orders of magnitude higher than previous reports.
Trellis BMA: Coded Trace Reconstruction on IDS Channels for DNA Storage
- Computer Science
- 2021 IEEE International Symposium on Information Theory (ISIT)
- 2021
Trellis BMA is introduced, a new reconstruction algorithm whose complexity is linear in the number of traces, and its performance is compared to previous algorithms to show that it reduces the error rate on both simulated and experimental data.