• Corpus ID: 237371678

Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning

@article{BarLev2021DeepDS,
  title={Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning},
  author={Dan Bar-Lev and Itai Orr and Omer Sabary and Tuvi Etzion and Eitan Yaakobi},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.00031}
}
The concept of DNA storage was first suggested in 1959 by Richard Feynman, who shared his vision of nanotechnology in the talk “There's Plenty of Room at the Bottom”. Later, towards the end of the 20th century, interest in storage solutions based on DNA molecules increased as a result of the Human Genome Project, which in turn led to significant progress in sequencing and assembly methods. DNA storage enjoys major advantages over the well-established magnetic and optical…

Single-Read Reconstruction for DNA Data Storage Using Transformers
TLDR
This work proposes a novel approach for single-read reconstruction using an encoder-decoder Transformer architecture for DNA-based data storage and achieves lower error rates when reconstructing the original data from a single read of each DNA strand compared to state-of-the-art algorithms using 2-3 copies.
On The Decoding Error Weight of One or Two Deletion Channels
TLDR
This paper studies optimal decoding for a special case of the deletion channel, referred to as the k-deletion channel, which deletes exactly k symbols of the transmitted word uniformly at random, to understand how an optimal decoder operates in order to minimize the expected normalized distance.
The Input and Output Entropies of the k-Deletion/Insertion Channel
TLDR
For both the 1-insertion and 1-deletion channels, it is proved that among all words with a fixed number of runs, the input entropy is minimized for words with a skewed distribution of their run lengths and it is maximized for words with a balanced distribution of their run lengths.

References

SHOWING 1-10 OF 48 REFERENCES
Single-Read Reconstruction for DNA Data Storage Using Transformers
TLDR
This work proposes a novel approach for single-read reconstruction using an encoder-decoder Transformer architecture for DNA-based data storage and achieves lower error rates when reconstructing the original data from a single read of each DNA strand compared to state-of-the-art algorithms using 2-3 copies.
Portable and Error-Free DNA-Based Data Storage
TLDR
This work represents the only known random access DNA-based data storage system that uses error-prone nanopore sequencers, while still producing error-free readouts with the highest reported information rate/density.
Scaling up DNA data storage and random access retrieval
TLDR
A novel coding scheme is developed that dramatically reduces the physical redundancy (sequencing read coverage) required for error-free decoding to a median of 5x, while maintaining levels of logical redundancy comparable to the best prior codes.
A Characterization of the DNA Data Storage Channel
TLDR
It is found that errors within molecules are mainly due to synthesis and sequencing, while imperfections in handling and storage lead to a significant loss of sequences.
Data storage in DNA with fewer synthesis cycles using composite DNA letters
TLDR
The development of encoding and decoding methods that exploit information redundancy using composite DNA letters, a representation of a position in a sequence that consists of a mixture of all four DNA nucleotides in a predetermined ratio, is reported.
Towards practical, high-capacity, low-maintenance information storage in synthesized DNA
TLDR
Theoretical analysis indicates that the DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving.
Towards Practical and Robust DNA-based Data Archiving by Codec System Named ‘Yin-Yang’
TLDR
This paper proposes a robust DNA-based data storage method based on a new codec algorithm, namely ‘Yin-Yang’, which exhibits great potential at achieving high storing capacity per nucleotide (230 PB/gram) and high fidelity of data recovery.
DNA Fountain enables a robust and efficient storage architecture
TLDR
A storage strategy that is highly robust and approaches the information capacity per nucleotide, and a perfect retrieval from a density of 215 petabytes per gram of DNA, orders of magnitude higher than previous reports, are reported.
Trellis BMA: Coded Trace Reconstruction on IDS Channels for DNA Storage
TLDR
Trellis BMA is introduced, a new reconstruction algorithm whose complexity is linear in the number of traces, and its performance is compared to previous algorithms to show that it reduces the error rate on both simulated and experimental data.