The computational design of synthetic DNA sequences with designer in vivo properties is gaining traction in the field of synthetic genomics. We propose a computational method that combines a kinetic Monte Carlo framework with deep mutational screening based on deep learning predictions. We apply our method to build regular nucleosome arrays with tailored nucleosomal repeat lengths (NRLs) in yeast. Our design was validated in vivo by successfully engineering and integrating thousands of kilobases long tandem arrays of computationally optimized sequences, which could accommodate NRLs much larger than the natural yeast NRL (namely 197 and 237 bp, compared to the natural NRL of ∼165 bp). RNA-seq results show that transcription of the arrays can occur but is not driven by the NRL. The computational method proposed here delineates the key sequence rules for nucleosome positioning in yeast and should be easily applicable to other sequence properties and other genomes.