Abstract

The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single-sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold's purely in silico prediction not only is close to experimentally guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5' and 3' untranslated regions (UTRs) (∼29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies undiscovered conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, small interfering RNAs (siRNAs), CRISPR-Cas13 guide RNAs, and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies and will be a useful tool in fighting the current and future pandemics.

Highlights

  • The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits

  • LinearTurboFold is a general technique that can be applied to other RNA viruses and full-length genome studies and will be a useful tool in fighting the current and future pandemics

  • We present LinearTurboFold, an end-to-end linear-time algorithm for structural alignment and conserved structure prediction of RNA homologs, which is, to our knowledge, the first joint fold-and-align algorithm that scales to full-length SARS-CoV-2 genomes without imposing any constraints on base-pairing distance


Summary

Introduction

The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Algorithms that predict conserved structures by simultaneously folding and aligning multiple RNA homologs scale at best cubically with sequence length and are therefore infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. One such approach, the “joint fold-and-align” method, seeks to simultaneously predict structures and a structural alignment for two or more sequences. It was first proposed by Sankoff [15] using a dynamic programming algorithm. We present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length and is orders of magnitude faster, enabling global structural analysis of SARS-CoV-2. To our knowledge, it is the first method to simultaneously fold and align whole genomes of SARS-CoV-2 variants (∼30 kb). LinearTurboFold is a general technique for full-length genome studies and can help fight the current and future pandemics.
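To give a rough sense of why cubic scaling becomes prohibitive at genome length, the sketch below compares how the work of a cubic-time and a linear-time method grows between a short RNA and a ∼30,000 nt genome. It is only back-of-the-envelope arithmetic on asymptotic exponents: the 300 nt baseline and the function name `relative_cost` are assumptions made for illustration, constant factors and memory are ignored, and nothing here reflects LinearTurboFold's actual implementation.

```python
def relative_cost(n_short: int, n_long: int, exponent: int) -> float:
    """Factor by which the work of an O(n**exponent) method grows
    when the sequence length increases from n_short to n_long."""
    return (n_long / n_short) ** exponent


SHORT_NT = 300       # hypothetical short RNA, chosen only for illustration
GENOME_NT = 30_000   # approximate coronavirus genome length quoted in the text

cubic_growth = relative_cost(SHORT_NT, GENOME_NT, 3)   # cubic-time method
linear_growth = relative_cost(SHORT_NT, GENOME_NT, 1)  # linear-time method

print(f"cubic-time work grows by a factor of {cubic_growth:,.0f}")
print(f"linear-time work grows by a factor of {linear_growth:,.0f}")
```

Under these assumptions, a 100-fold increase in length multiplies the work of a cubic-time method by a million, but that of a linear-time method by only 100.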

