Secondary structure elements, such as alpha helices and beta strands, play a fundamental role in defining the overall fold of a protein. Leveraging secondary structure information is essential for encoding structural features in coarse-grained protein models. Such models simplify the representation of amino acid residues, thereby reducing computational complexity. By incorporating accurate (even if only partial) secondary structure data, these models can efficiently search for the native conformation of a protein and preserve its core structural motifs over extended time frames. Here, the pivotal role of (predicted) secondary structure data in the coarse-grained modeling of protein tertiary and quaternary structures, along with their long-time dynamics, is investigated. Computational simulations of large protein systems were performed using the low-resolution SURPASS model. These case studies demonstrate that predicted secondary structure data are sufficient for accurate fold assembly. The same data also support a realistic depiction of long-time dynamics in the recorded pseudo-trajectories, which are generated with a Monte Carlo dynamics sampling scheme based on a long random sequence of local conformational modifications. This approach may provide a powerful tool for investigating the critical stages of protein folding. A future combination with knowledge-based potentials derived using machine learning techniques offers opportunities to unravel the mechanisms underlying biological processes in a variety of molecular complexes.
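The sampling idea mentioned above, a long random sequence of local conformational modifications accepted or rejected by a Monte Carlo criterion, can be illustrated with a minimal sketch. This is not the SURPASS force field or its move set: the bead chain, the harmonic bond energy, the Metropolis acceptance rule, and all names and parameter values (`chain_energy`, `mc_dynamics`, `bond_length`, `max_shift`, `temperature`) are assumptions chosen only for illustration.

```python
import math
import random


def chain_energy(coords, bond_length=3.8, k=1.0):
    """Toy energy (assumption): harmonic penalty on consecutive bead distances."""
    energy = 0.0
    for a, b in zip(coords, coords[1:]):
        energy += k * (math.dist(a, b) - bond_length) ** 2
    return energy


def mc_dynamics(coords, n_steps, temperature=1.0, max_shift=0.5, seed=0):
    """Run a Metropolis walk built from small single-bead displacements.

    Returns the recorded pseudo-trajectory (one snapshot per attempted move,
    plus the starting conformation) and the number of accepted moves.
    """
    rng = random.Random(seed)
    coords = [tuple(p) for p in coords]
    energy = chain_energy(coords)
    trajectory = [list(coords)]
    accepted = 0
    for _ in range(n_steps):
        # Local modification: shift one randomly chosen bead by a small amount.
        i = rng.randrange(len(coords))
        old = coords[i]
        coords[i] = tuple(x + rng.uniform(-max_shift, max_shift) for x in old)
        new_energy = chain_energy(coords)
        delta = new_energy - energy
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            accepted += 1
        else:
            coords[i] = old  # rejected: restore the previous position
        trajectory.append(list(coords))
    return trajectory, accepted


# Usage: a straight 10-bead chain sampled for 200 attempted moves.
start = [(3.8 * i, 0.0, 0.0) for i in range(10)]
trajectory, accepted = mc_dynamics(start, n_steps=200)
print(f"{len(trajectory)} snapshots, {accepted} accepted moves")
```

Because every move changes only one bead by a small amount, consecutive snapshots differ only locally, which is what allows such a sequence to be read as a pseudo-trajectory rather than as a set of uncorrelated samples.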