Abstract
Background: With its inherently high density and durability, DNA has recently been recognized as a promising medium for storing enormous amounts of data over millennia. To overcome the limitations of a recently reported high-capacity DNA data storage scheme while achieving a competitive information capacity, we explore a new coding system that facilitates the practical implementation of high-capacity DNA data storage.

Result: In this work, we devised and implemented a DNA data storage scheme with variable-length oligonucleotides (oligos), introducing a hybrid DNA mapping scheme that converts digital data to DNA records. The encoded DNA oligos store 1.98 bits per nucleotide (bits/nt) on average, approaching the upper bound of 2 bits/nt, while conforming to the biochemical constraints. In addition, an oligo-level repeat-accumulate coding scheme is employed to address data loss and corruption in the biochemical processes. In a wet-lab experiment, error-free retrieval of 379.1 KB of data was achieved with a minimum coverage of 10x, validating the error resilience of the proposed coding scheme. Furthermore, theoretical analysis shows that the proposed scheme exhibits a net information density (user bits per nucleotide) of 1.67 bits/nt while achieving 91% of the information capacity.

Conclusion: To advance towards practical implementations of DNA storage, we proposed and tested a DNA data storage system that combines a high-potential mapping (bit-to-nucleotide conversion) scheme with a low-redundancy yet highly efficient error-correction code design. The reported advance moves us closer to a practical high-capacity DNA data storage system.
Highlights
With the inherent high density and durable preservation, Deoxyribonucleic acid (DNA) has been recently recognized as a distinguished medium to store enormous data over millennia
We introduce an interleaver after the binary-to-quaternary mapping, which scrambles the original order of the nucleotide sequences
The limited manufacturing precision and techniques of DNA synthesis result in a low throughput and short length of valid DNA sequences, which leads to laborious work with high cost
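To make the binary-to-quaternary mapping and the interleaver mentioned in the highlights more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a fixed 2 bits/nt lookup table (`00→A`, `01→C`, `10→G`, `11→T`) and a seeded pseudo-random permutation as the interleaver; the function names, the table, and the seed are all hypothetical. The paper's hybrid mapping averages 1.98 bits/nt because it also has to satisfy biochemical constraints, which this sketch does not enforce.

```python
import random

# Assumed lookup table: each pair of bits maps to one nucleotide (2 bits/nt).
BITS_TO_NT = {"00": "A", "01": "C", "10": "G", "11": "T"}
NT_TO_BITS = {nt: bits for bits, nt in BITS_TO_NT.items()}


def bytes_to_nt(data: bytes) -> str:
    """Map each byte to four nucleotides using the assumed table."""
    bits = "".join(f"{b:08b}" for b in data)
    return "".join(BITS_TO_NT[bits[i:i + 2]] for i in range(0, len(bits), 2))


def nt_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_nt; expects a length that is a multiple of 4."""
    bits = "".join(NT_TO_BITS[nt] for nt in seq)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def _permutation(n: int, seed: int) -> list[int]:
    """Deterministic permutation shared by the encoder and the decoder."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm


def interleave(seq: str, seed: int = 0) -> str:
    """Scramble nucleotide order: output position i takes seq[perm[i]]."""
    perm = _permutation(len(seq), seed)
    return "".join(seq[p] for p in perm)


def deinterleave(seq: str, seed: int = 0) -> str:
    """Undo interleave() by writing each symbol back to its source index."""
    perm = _permutation(len(seq), seed)
    out = [""] * len(seq)
    for i, p in enumerate(perm):
        out[p] = seq[i]
    return "".join(out)


# Round trip: map, scramble, unscramble, and recover the payload.
payload = b"DNA"
scrambled = interleave(bytes_to_nt(payload), seed=42)
assert nt_to_bytes(deinterleave(scrambled, seed=42)) == payload
```

Because the interleaver only reorders symbols, it leaves the nucleotide composition of the sequence unchanged while breaking up the original ordering.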
Summary
With its inherently high density and durability, DNA has recently been recognized as a promising medium for storing enormous amounts of data over millennia. It has been predicted that the amount of data around the world will rise to 44 zettabytes by 2020, with 2.5 exabytes of data produced daily [1]. With this ever-increasing volume of digital information, how to store enormous amounts of data with high reliability, capacity and durability has been much discussed. Traditional digital storage systems (e.g., CD, DVD, flash drives, etc.) can provide a density of around 201 GB/in, but require a large physical space to store data on the order of zettabytes [2]. Another desirable characteristic of data storage is a long preservation duration. Deoxyribonucleic acid (DNA) has recently attracted much attention because its inherent features, such as high physical density and long durability, suit the requirements of large-scale, long-term storage [4].