Under certain conditions, RNA repeat sequences phase separate, yielding protein-free biomolecular condensates. RNA repeat sequences have also been implicated in neurological disorders, such as Huntington's disease. Thus, mapping repeat sequences to their phase behavior, functions, and dysfunctions is an active area of research. However, despite several advances, it remains challenging to characterize RNA phase behavior at submolecular resolution. Here, we implement a residue-resolution coarse-grained model in LAMMPS, one that incorporates both RNA sequence and structure, to study the clustering propensities of protein-free RNA systems. We achieve a multifold speedup in simulation time compared to previous work. Leveraging this efficiency, we study the clustering propensity of all 20 nonredundant trinucleotide repeat sequences. Our results align with experimental findings, emphasizing that canonical base pairing and G-U wobble pairs play dominant roles in regulating cluster formation of RNA repeat sequences. We also find strong entropic contributions to the stability and composition of RNA clusters, which we demonstrate for single-component RNA systems as well as binary mixtures of trinucleotide repeats. Additionally, we compare the clustering behaviors of trinucleotide (odd) repeats and their tetranucleotide (even) counterparts. We observe that odd repeats exhibit stronger clustering tendencies, which we attribute to consecutive base pairs in their sequences that are disrupted in even repeat sequences. Altogether, our work extends the set of computational tools for probing RNA cluster formation at submolecular resolution and uncovers physicochemical principles that govern the stability and composition of the resulting clusters.
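The count of 20 nonredundant trinucleotide repeats can be reproduced with a short enumeration. The sketch below is illustrative only and rests on two assumptions that the abstract does not spell out: that two repeat units are redundant when one is a cyclic rotation of the other (so that (CAG)n and (AGC)n describe the same repeat up to end effects), and that the four homopolymers (AAA, CCC, GGG, UUU) are excluded. The function names are hypothetical and are not taken from the authors' code.

```python
from itertools import product

BASES = "ACGU"  # RNA alphabet


def canonical_rotation(unit: str) -> str:
    """Return the lexicographically smallest cyclic rotation of a repeat unit."""
    return min(unit[i:] + unit[:i] for i in range(len(unit)))


def nonredundant_trinucleotide_units() -> list[str]:
    """Enumerate trinucleotide repeat units up to cyclic rotation.

    Assumption (not stated in the abstract): homopolymers are skipped and
    rotations of the same unit are treated as one repeat sequence.
    """
    units = set()
    for letters in product(BASES, repeat=3):
        unit = "".join(letters)
        if len(set(unit)) == 1:
            continue  # skip homopolymers such as AAA
        units.add(canonical_rotation(unit))
    return sorted(units)


trinucleotide_units = nonredundant_trinucleotide_units()
print(len(trinucleotide_units))  # 20
```

Of the 64 possible trinucleotides, 60 are not homopolymers, and each of these belongs to a rotation class of exactly three members, which gives 60 / 3 = 20 distinct repeat units under these assumptions.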