Abstract

Synthetic DNA-based data storage systems have received significant attention due to the promise of ultrahigh storage density and long-term stability. However, all known platforms suffer from high cost, read-write latency and error rates that render them noncompetitive with modern storage devices. One means to avoid the above problems is using readily available native DNA. As the sequence content of native DNA is fixed, one can modify the topology instead to encode information. Here, we introduce DNA punch cards, a macromolecular storage mechanism in which data is written in the form of nicks at predetermined positions on the backbone of native double-stranded DNA. The platform accommodates parallel nicking on orthogonal DNA fragments and enzymatic toehold creation that enables single-bit random-access and in-memory computations. We use Pyrococcus furiosus Argonaute to punch files into the PCR products of Escherichia coli genomic DNA and accurately reconstruct the encoded data through high-throughput sequencing and read alignment.

Highlights

  • Synthetic DNA-based data storage systems have received significant attention due to the promise of ultrahigh storage density and long-term stability

  • Known nickases are only able to detect and bind specific sequences in DNA strands that tend to be highly restricted by their context

  • NGS readout and alignment experiments: As a proof of concept, we report write-read results for two compressed files: a 272-word, 0.4 KB text file containing Lincoln’s Gettysburg Address (LGA) and a 14 KB JPEG image of the Lincoln Memorial


Summary

Introduction

Synthetic DNA-based data storage systems have received significant attention due to the promise of ultrahigh storage density and long-term stability. The information stored in nicks can be retrieved in an error-free manner using NGS technologies, similar to synthesis-based approaches. This is accomplished by aligning the DNA fragments obtained through the nicking process to the known reference genomic DNA strands. Because the reference is available, even very small fragment coverages lead to error-free readouts, which is an important feature of the system. Alternative readout approaches, such as non-destructive solid-state nanopore sequencing, can be used instead, provided that further advancement of the related technologies enables high readout precision. Nick-based storage allows for introducing a number of additional functionalities into the storage system, such as bitwise random access and pooling (both reported in this work) and in-memory computing solutions reported in a follow-up paper [15]. These features come at the cost of reduced storage density, which is roughly 50-fold smaller than that of synthetic DNA-based platforms. Topological alterations in DNA in the form of secondary structure may not be cost- or density-efficient, as additional nucleotides are needed to create such structures [16].
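The alignment-based readout described above can be illustrated with a toy model. The Python sketch below is not the authors' pipeline. It assumes an idealized, noise-free setting in which each aligned fragment is reported as a half-open coordinate pair on a single reference strand, a nick at a predetermined site encodes a 1, and the absence of a nick encodes a 0. The function names (`write_nicks`, `fragments_from_nicks`, `read_bits`), the site positions, and the bit convention are all hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Fragment = Tuple[int, int]  # half-open (start, end) coordinates on the reference


def write_nicks(bits: List[int], sites: List[int]) -> List[int]:
    """Return the site positions to nick: site i is nicked iff bits[i] == 1 (assumed convention)."""
    return [site for bit, site in zip(bits, sites) if bit == 1]


def fragments_from_nicks(ref_len: int, nicked: List[int]) -> List[Fragment]:
    """Split the strand [0, ref_len) at every nick, as would happen after denaturation."""
    cuts = [0] + sorted(nicked) + [ref_len]
    return list(zip(cuts, cuts[1:]))


def read_bits(aligned: List[Fragment], sites: List[int], ref_len: int) -> List[int]:
    """Infer bits from aligned fragments: a fragment end at a site implies a nick there."""
    endpoints = set()
    for start, end in aligned:
        endpoints.add(start)
        endpoints.add(end)
    # The two ends of the reference itself are not nicks.
    endpoints -= {0, ref_len}
    return [1 if site in endpoints else 0 for site in sites]


# Hypothetical example: five candidate nick sites on a 120-nt reference.
sites = [20, 40, 60, 80, 100]
bits = [1, 0, 1, 1, 0]
fragments = fragments_from_nicks(120, write_nicks(bits, sites))
decoded = read_bits(fragments, sites, 120)
```

In this toy model every nick is bordered by two fragments, so observing only some of the fragments can still reveal all nick positions. This is one intuition for why a known reference keeps readout reliable at low coverage; real sequencing data would additionally require handling of alignment noise and both strands.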

