Abstract

Lineage-tracing technologies based on Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9 (CRISPR-Cas9) genome editing have emerged as a powerful tool for investigating development in single-cell contexts, but exact reconstruction of the underlying clonal relationships in experiments is complicated by features of the data. These complications depend on the experimental parameters of these systems, such as the Cas9 cutting rate, the diversity of indel outcomes, and the rate of missing data. In this paper, we develop two theoretically grounded algorithms for reconstructing the underlying single-cell phylogenetic tree, as well as asymptotic bounds on the number of recording sites necessary for exact recapitulation of the ground-truth phylogeny with high probability. In doing so, we explore the relationship between problem difficulty and the experimental parameters, with implications for experimental design. Lastly, we provide simulations showing the empirical performance of these algorithms and showing that the trends in the asymptotic bounds hold empirically. Overall, this work provides a theoretical analysis of phylogenetic reconstruction in single-cell CRISPR-Cas9 lineage-tracing technologies.
