Abstract

We describe properties and constructions of constraint-based codes for DNA-based data storage which account for the maximum repetition length and AT/GC balance. Generating functions and approximations are presented for computing the number of sequences that satisfy the maximum repetition length and AT/GC balance constraints. We describe routines for translating binary runlength-limited and/or balanced strings into DNA strands, and compute the efficiency of such routines. Expressions for the redundancy of codes that account for both the maximum repetition length and AT/GC balance are derived.
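As a rough illustration of the counting problem mentioned in the abstract, the sketch below counts q-ary sequences of length n whose longest run of identical symbols is at most m. It uses the standard recurrence N(n) = (q - 1)(N(n-1) + ... + N(n-m)) for n > m, with N(n) = q^n for n <= m, rather than the generating functions of the paper. The function names `count_rll` and `count_rll_brute` are illustrative assumptions, and only the run-length constraint is covered (no AT/GC balance).

```python
from itertools import groupby, product


def count_rll(n, m, q=4):
    """Count q-ary sequences of length n with no run longer than m (m >= 1)."""
    # For lengths up to m the constraint cannot be violated, so N(k) = q**k.
    counts = [q ** k for k in range(min(n, m) + 1)]
    # For longer sequences: the final run has length i in 1..m and its symbol
    # differs from the symbol before it, giving (q - 1) choices per prefix.
    for k in range(m + 1, n + 1):
        counts.append((q - 1) * sum(counts[k - m:k]))
    return counts[n]


def count_rll_brute(n, m, q=4):
    """Exhaustive check of the same count; only practical for small n."""
    return sum(
        1
        for seq in product(range(q), repeat=n)
        if all(len(list(group)) <= m for _, group in groupby(seq))
    )
```

For the quaternary alphabet and a maximum run of three, `count_rll(n, 3)` can be compared against `count_rll_brute(n, 3)` for small n to confirm the recurrence.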

Highlights

  • The first large-scale archival DNA-based storage architecture was implemented by Church et al. [1] in 2012

  • We describe properties and constructions of quaternary constraint-based codes for DNA-based storage which account for a maximum homopolymer run and maximum unbalance between AT and GC contents

  • We show that constrained binary sequences can be translated into constrained quaternary sequences, which opens the door to a wealth of efficient binary code constructions for application in DNA-based storage [10]–[13]

Summary

INTRODUCTION

The first large-scale archival DNA-based storage architecture was implemented by Church et al. [1] in 2012. Recent examples of experimental work on DNA-based storage can be found in [4]–[6]. Blawat’s format [2] incorporates a constrained code that uses a look-up table to translate binary source data into strands of nucleotides with a homopolymer run of length at most three. We describe properties and constructions of quaternary constraint-based codes for DNA-based storage which account for a maximum homopolymer run and a maximum unbalance between AT and GC contents. We show that constrained binary sequences can be translated into constrained quaternary sequences, which opens the door to a wealth of efficient binary code constructions for application in DNA-based storage [10]–[13]. Cai: Properties and Constructions of Constrained Codes for DNA-Based Data Storage.
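To make the two constraints named above concrete, here is a minimal sketch that measures the longest homopolymer run and the AT/GC unbalance of a strand. It assumes the unbalance is the absolute difference between the number of G/C and A/T nucleotides; the paper may define it differently. The function names and the default limits are hypothetical, chosen for illustration only.

```python
from itertools import groupby


def max_homopolymer_run(strand):
    """Length of the longest run of identical nucleotides (0 for an empty strand)."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def gc_at_unbalance(strand):
    """Absolute difference between the G/C count and the A/T count (assumed definition)."""
    gc = sum(base in "GC" for base in strand)
    at = len(strand) - gc
    return abs(gc - at)


def satisfies_constraints(strand, max_run=3, max_unbalance=2):
    """True if the strand meets both the run-length and the balance limit."""
    return (
        max_homopolymer_run(strand) <= max_run
        and gc_at_unbalance(strand) <= max_unbalance
    )
```

For example, the strand "ACGTTTGA" has a longest run of three (TTT) and contains three G/C and five A/T nucleotides, so it passes with the default limits, whereas "AAAACGCG" fails because of its run of four A's.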

MAXIMUM RUNLENGTH CONSTRAINT
COUNTING QUATERNARY RLL SEQUENCES OF GIVEN WEIGHT
REDUNDANCY OF BINARY AND QUATERNARY CODES
FINDINGS
CONCLUSION
