Abstract

Polymer-based data-storage platforms use chains of binary synthetic polymers as recording media and read the content via tandem mass spectrometers. For such systems, we propose the first known family of codes that allows for both unique string reconstruction and correction of multiple mass errors. We consider two approaches. The first addresses asymmetric errors and introduces redundancy that scales linearly with the number of errors and logarithmically with the length of the string. The construction allows the string to be uniquely reconstructed from its erroneous substring composition multiset alone. The key idea behind our unique reconstruction approach is to interleave (shifted) Catalan-Bertrand strings with arbitrary binary strings and “reflect” them so as to force prefixes and suffixes of the same length to have different weights. The asymptotic code rate of the scheme is one, and decoding is accomplished via a simplified version of the backtracking algorithm used for the Turnpike problem. The second approach addresses symmetric errors: we use a polynomial characterization of the mass information and adapt polynomial evaluation code constructions to this setting. In the process, we develop new efficient decoding algorithms for a constant number of composition errors.
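
To make the notion of a substring composition multiset concrete, the following minimal sketch computes it for a short binary string, assuming the composition of a substring is its pair of counts (number of 0s, number of 1s). The function name `composition_multiset` is hypothetical and not taken from the paper. The sketch also shows why unique reconstruction needs a code constraint: a string and its reversal always share the same multiset.

```python
from collections import Counter


def composition_multiset(s: str) -> Counter:
    """Return the multiset of (zeros, ones) counts over all contiguous substrings of s.

    Illustrative helper only; the name and representation are assumptions.
    """
    comps = Counter()
    n = len(s)
    for i in range(n):
        zeros = ones = 0
        # Extend the substring s[i..j] one symbol at a time and record its composition.
        for j in range(i, n):
            if s[j] == "0":
                zeros += 1
            else:
                ones += 1
            comps[(zeros, ones)] += 1
    return comps


# A string and its reversal are indistinguishable from compositions alone.
same = composition_multiset("001") == composition_multiset("100")

# Two strings with the same weight can still have different multisets.
different = composition_multiset("0101") != composition_multiset("0110")
```

A string of length n contributes n(n+1)/2 compositions, one per contiguous substring, so the multiset grows quadratically while the string itself carries only n bits.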

