Abstract
The DNA regulatory code of gene expression is encoded in the gene regulatory structure spanning the coding and adjacent non-coding regulatory DNA regions. Deciphering this regulatory code, and how the whole gene structure interacts to produce mRNA transcripts and regulate mRNA abundance, can greatly improve our ability to control gene expression. Here, we reason that natural systems offer the most accurate information on gene expression regulation and apply deep learning to over 20,000 mRNA datasets to learn the DNA-encoded regulatory code across a variety of model organisms, from bacteria to human [1]. We find that up to 82% of the variation in gene expression is encoded in the gene regulatory structure across all model organisms. Coding and regulatory regions carry both overlapping and new, orthogonal information, and contribute additively to gene expression prediction. By mining the gene expression models for the relevant DNA regulatory motifs, we find that motif interactions across the whole gene regulatory structure define over three orders of magnitude of gene expression levels. Finally, we experimentally verify the usefulness of our AI-guided approach for protein expression engineering. Our results suggest that single motifs or regulatory regions might not be solely responsible for regulating gene expression levels. Instead, the whole gene regulatory structure, which contains the DNA regulatory grammar of interacting DNA motifs across the protein-coding and non-coding regulatory regions, forms a coevolved transcriptional regulatory unit. This provides a solution by which whole gene systems with pre-specified expression patterns can be designed.