Abstract

Owing to their high information density and longevity, DNA storage systems have begun to attract considerable attention. However, DNA synthesis and sequencing introduce insertion, deletion, and substitution errors, which remain common obstacles to DNA storage. In this paper, we first explain a method for converting binary data into maximum run-length $r$ sequences of a specific length, for general $r$, which can be used as the message sequences of our proposed code. Then, we propose a new nonbinary systematic single insertion/deletion error correction code and its corresponding encoding algorithm. In the proposed code, the maximum run-length of the parity sequence is fixed at three, and the last parity symbol and the first message symbol are always different. Hence, the overall maximum run-length of the output codeword is guaranteed to be three when the maximum run-length of the message sequence is three. Finally, we examine the feasibility of the proposed encoding algorithm, verify successful decoding when a single insertion/deletion error occurs in the codeword, and present comparison results with relevant works.
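The run-length argument in the abstract can be illustrated with a small check. The following is a minimal sketch, not the paper's encoder: it assumes a quaternary alphabet written as the letters A, C, G, T and assumes the parity sequence is placed directly before the message sequence, as the phrase "the last parity symbol and the first message symbol" suggests. The function names and the example strings are hypothetical.

```python
from itertools import groupby


def max_run_length(seq):
    """Return the length of the longest run of identical consecutive symbols."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def concatenation_keeps_run_limit(parity, message, r=3):
    """Check the junction argument for one (parity, message) pair.

    Preconditions (taken from the abstract's description, hypothetical layout):
    both parts have maximum run-length at most r, and the symbols that meet
    at the boundary are different.
    """
    assert max_run_length(parity) <= r
    assert max_run_length(message) <= r
    assert parity[-1] != message[0]
    # No run can cross the boundary, so the limit carries over to the whole.
    return max_run_length(parity + message) <= r


# Illustrative strings only; they are not produced by the paper's algorithm.
parity = "ACCCG"
message = "TTTAGGGC"
print(max_run_length(parity + message))
print(concatenation_keeps_run_limit(parity, message))
```

Because the boundary symbols differ, every run of the concatenated sequence lies entirely inside the parity part or entirely inside the message part, which is why the limit of three is preserved.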

Highlights

  • As people gradually rely on more and more data, the hardware of data storage systems has been gradually upgraded

  • We propose a new nonbinary single insertion/deletion error correction (SIDEC) code with the maximum run-length r constraint and systematic encoding algorithm

  • We present an application of a codeword where the maximum run-length r is three for DNA storage


Summary

INTRODUCTION

As people gradually rely on more and more data, the hardware of data storage systems has been gradually upgraded. A new binary SIDEC code combined with a maximum run-length r constrained code and an efficient systematic encoding algorithm was proposed in [18]. All of these studies [8], [9], [11], [12], [17] focused on binary coding schemes for DNA storage systems. Insertion and deletion errors inevitably occur in the process of DNA synthesis and sequencing. For these purposes, we propose a new nonbinary SIDEC code with the maximum run-length r constraint and a systematic encoding algorithm. Simulation results show that the encoding algorithm is feasible and that the q-ary SIDEC code with the maximum run-length r constraint can correct a single deletion or insertion error.
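To make the error model concrete, the sketch below enumerates every sequence that a single deletion or a single insertion can produce from a given q-ary word. It is an illustration of the channel only, not the paper's decoder; the alphabet "ACGT" (q = 4), the function names, and the example word are assumptions.

```python
def single_deletions(seq):
    """All distinct sequences obtained by deleting exactly one symbol."""
    return {seq[:i] + seq[i + 1:] for i in range(len(seq))}


def single_insertions(seq, alphabet="ACGT"):
    """All distinct sequences obtained by inserting exactly one symbol."""
    return {
        seq[:i] + symbol + seq[i:]
        for i in range(len(seq) + 1)
        for symbol in alphabet
    }


# Illustrative word only; it is not a codeword of the proposed code.
word = "AACG"
print(sorted(single_deletions(word)))   # ['AAC', 'AAG', 'ACG']
print(len(single_insertions(word)))     # 16
```

Deleting either symbol of a run gives the same result, so the number of distinct single-deletion outcomes equals the number of runs in the word. A code can correct a single deletion exactly when these deletion sets are disjoint for every pair of distinct codewords.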

PRELIMINARIES
PROPOSED CODE CONSTRUCTION AND ALGORITHM
CODE CONSTRUCTION AND ENCODING FOR A
CODE CONSTRUCTION
ENCODING ALGORITHM
DECODING FOR AN INSERTION OR DELETION ERROR
SIMULATION AND COMPARISON RESULTS
Findings
CONCLUSION AND FUTURE WORK