Abstract

Synthetic DNA is gaining momentum as a potential medium for archival data storage. In this process, digital information is translated into sequences of nucleotides, and the resulting synthetic DNA strands are stored for later retrieval. Here, we demonstrate reliable file recovery with PCR-based random access when as few as ten copies per sequence are stored, on average. This results in a density of about 17 exabytes/gram, nearly two orders of magnitude greater than prior work has shown. We successfully retrieve the same data from a complex pool of over 10¹⁰ unique sequences per microliter, with no evidence that we have begun to approach complexity limits. Finally, we investigate the effects of file size and sequencing coverage on successful file retrieval and look for systematic DNA strand dropout. These findings substantiate the robustness and high data density of the process examined here.

Highlights

  • Synthetic DNA is gaining momentum as a potential medium for archival data storage

  • In principle, maximum density would require only one copy of each sequence for polymerase chain reaction (PCR) random access. In practice this is not sufficient, for two reasons: stochastic variations in copy numbers that arise from sub-sampling the pool during random access, and copy number variations that arise from synthesis

  • By demonstrating that files with a copy number of approximately 10 can be successfully recovered, we present the most practically dense system to date at 17 EB g⁻¹, nearly two orders of magnitude greater than prior work has shown
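The sub-sampling effect named in the second highlight can be illustrated with a small calculation. The sketch below assumes that the number of copies of each sequence in a sub-sample follows a Poisson distribution around the mean copy number; this model is an illustrative assumption, not something stated in the text, and it ignores the additional copy number skew introduced by synthesis. The function names are hypothetical.

```python
import math


def dropout_probability(mean_copies: float) -> float:
    """Probability that a sequence has zero copies in a sub-sample.

    Assumes (for illustration only) that copy numbers are Poisson
    distributed with the given mean, so P(0 copies) = exp(-mean).
    """
    return math.exp(-mean_copies)


def expected_missing(num_sequences: int, mean_copies: float) -> float:
    """Expected number of sequences absent from the sub-sample."""
    return num_sequences * dropout_probability(mean_copies)


if __name__ == "__main__":
    # Compare a single copy per sequence with roughly ten copies.
    for mean in (1, 10):
        p = dropout_probability(mean)
        print(f"mean copies = {mean:2d}: P(sequence missing) = {p:.2e}")
```

Under this assumed model, a mean of one copy per sequence would leave more than a third of sequences absent from a sub-sample, while a mean of ten copies leaves only a few sequences per hundred thousand absent, a loss small enough that error-correcting redundancy could plausibly cover it.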

Summary

Introduction

Synthetic DNA is gaining momentum as a potential medium for archival data storage. We demonstrate reliable file recovery with PCR-based random access when as few as ten copies per sequence are stored, on average. This results in a density of about 17 exabytes/gram, nearly two orders of magnitude greater than prior work has shown. This work further supports the robustness and high-density storage potential of DNA: we demonstrate that we have not yet reached the limit of permissible pool complexity, and with a minimum copy number of 10 we show that this process yields the densest DNA storage system to date at 17 exabytes per gram (EB g⁻¹). Previous work in this space recognized the importance of storage density for DNA to become a practical archival storage medium [1,3,7,10], but the greatest complexity surrounding random access in those works reached just over 10⁷ unique sequences [10].
