Abstract

In the beautifully simple-to-state problem of trace reconstruction, the goal is to reconstruct an unknown binary string x given random “traces” of x, where each trace is generated by deleting each coordinate of x independently with probability p < 1. The problem is well studied both when the unknown string is arbitrary and when it is chosen uniformly at random. For both settings, there is still an exponential gap between upper and lower sample complexity bounds, and our understanding of the problem is still surprisingly limited. In this paper, we consider natural parameterizations and generalizations of this problem in an effort to attain a deeper and more comprehensive understanding. Perhaps our most surprising results are: 1) We prove that exp(O(n^{1/4} √(log n))) traces suffice for reconstructing arbitrary matrices. In the matrix version of the problem, each row and column of an unknown √n × √n matrix is deleted independently with probability p. Our result contrasts with the best known results for sequence reconstruction, where the best known upper bound is exp(O(n^{1/3})). 2) An optimal result for random matrix reconstruction: we show that Θ(log n) traces are necessary and sufficient. This is in contrast to the problem for random sequences, where there is a super-logarithmic lower bound and the best known upper bound is exp(O(log^{1/3} n)). 3) We show that exp(O(k^{1/3} log^{2/3} n)) traces suffice to reconstruct k-sparse strings, providing an improvement over the best known sequence reconstruction results when k = o(n/log^2 n). 4) We show that poly(n) traces suffice if x is k-sparse and we additionally have a “separation” promise, specifically that the indices of 1's in x all differ by Ω(k log n).
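
As an illustration of the two trace models described in the abstract, the following is a minimal Python sketch of the deletion channel for strings and for matrices. The function names `string_trace` and `matrix_trace`, the list-based representation, and the explicit `rng` argument are assumptions made for this example and are not taken from the paper.

```python
import random


def string_trace(x, p, rng):
    """Return one trace of the bit list x.

    Each coordinate is deleted independently with probability p,
    and the surviving coordinates keep their original order.
    """
    return [b for b in x if rng.random() >= p]


def matrix_trace(m, p, rng):
    """Return one trace of the matrix m (a list of equal-length rows).

    Each row and each column is deleted independently with probability p;
    the trace is the submatrix formed by the surviving rows and columns.
    """
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    kept_rows = [i for i in range(n_rows) if rng.random() >= p]
    kept_cols = [j for j in range(n_cols) if rng.random() >= p]
    return [[m[i][j] for j in kept_cols] for i in kept_rows]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so that runs are repeatable
    x = [1, 0, 0, 1, 1, 0, 1, 0]
    print(string_trace(x, 0.3, rng))
    m = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
    print(matrix_trace(m, 0.3, rng))
```

In the matrix model a deleted row or column removes an entire line of entries at once, so the entries of a trace are not deleted independently of one another, unlike in the string model.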

Highlights

  • In the trace reconstruction problem, first proposed by Batu et al. [4], the goal is to reconstruct an unknown string x ∈ {0, 1}^n given a set of random subsequences of x

  • An optimal result for random matrix reconstruction: we show that Θ(log n) traces are necessary and sufficient

  • We begin by considering parameterizations of the trace reconstruction problem


Summary

Introduction

In the trace reconstruction problem, first proposed by Batu et al. [4], the goal is to reconstruct an unknown string x ∈ {0, 1}^n given a set of random subsequences of x. The central question is how many traces are required to exactly reconstruct x with high probability. This intriguing problem has attracted significant attention from a large number of researchers [4, 8, 10, 11, 15, 17, 18, 21, 24, 26, 27, 28]. De et al. [11] and Nazarov and Peres [26] independently showed that exp(O((n/q)^{1/3})) traces suffice, where q = 1 − p. This bound is achieved by a mean-based algorithm, which means that the only information used is the fraction of traces that have a 1 in each position. In studying these settings, we refine existing tools and introduce new techniques that we believe may be helpful in closing the gaps in the fully general problem.
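
To make the notion of a mean-based statistic concrete, here is a small Python sketch that computes, for each position, the fraction of traces that have a 1 there. It only computes the statistic and is not the reconstruction algorithm of [11] or [26]. The names `deletion_trace` and `mean_trace`, and the convention of treating positions past the end of a short trace as 0, are assumptions made for this example.

```python
import random


def deletion_trace(x, p, rng):
    """Delete each bit of x independently with probability p."""
    return [b for b in x if rng.random() >= p]


def mean_trace(traces, n):
    """Return, for each position j < n, the fraction of traces with a 1 at j.

    Traces shorter than n contribute 0 at their missing positions
    (an assumed zero-padding convention).
    """
    counts = [0] * n
    for t in traces:
        for j, b in enumerate(t):
            counts[j] += b
    return [c / len(traces) for c in counts]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed only so that runs are repeatable
    x = [1, 0, 1, 1, 0, 0, 1, 0]
    traces = [deletion_trace(x, 0.2, rng) for _ in range(10000)]
    print([round(v, 3) for v in mean_trace(traces, len(x))])
```

A mean-based algorithm sees only this vector of n averages, so it must distinguish candidate strings by how their expected averages differ, which is why many traces are needed to estimate the averages accurately enough.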

Our Results
Our Techniques
Sparsity and Learning Binomial Mixtures
Well-Separated Sequences
A Recursive Hierarchical Clustering Algorithm and Its Analysis
Strengthening to a Parameterization by Runs
Sparsity with Gap
Reconstructing Arbitrary Matrices
Reconstructing Random Matrices
Oracle
Bounded Hamming Distance
