CRISPR-based genome engineering holds enormous promise for basic science and therapeutic applications. Integrating and editing DNA sequences is still challenging in many cellular contexts, largely due to insufficient control of the repair process. We find that repair at the genome-cargo interface is predictable by deep-learning models and adheres to sequence-context-specific rules. Based on in silico predictions, we devised a strategy of triplet base-pair repeat repair arms that correspond to microhomologies at double-strand breaks (trimologies), which facilitated integration of large cargo (>2 kb) and protected the targeted locus and transgene from excessive damage. Successful integrations occurred in >30 loci in human cells and in in vivo models. Germline-transmissible transgene integration in Xenopus and endogenous tagging of tubulin in adult mouse brains demonstrated integration during early embryonic cleavage and in non-dividing differentiated cells, respectively. Further, optimal repair arms for single- or double-nucleotide edits were predictable and facilitated small edits in vitro and in vivo using oligonucleotide templates. We provide a design tool (Pythia, pythia-editing.org) to optimize custom integration, tagging, or editing strategies. Pythia will facilitate genomic integration and editing for experimental and therapeutic purposes across a wider range of target cell types and applications.