The folded structures of proteins can be accurately predicted from their amino-acid sequences by deep learning algorithms. By contrast, despite decades of research, predicting folding pathways and the unfolded and misfolded states of proteins, which are intimately related to diseases, remains challenging. A two-state (folded/unfolded) description of protein folding dynamics hides the complexity of the unfolded and misfolded microstates. Here, we focus on the development of simplified order parameters to decipher the complexity of disordered protein structures. First, we show that any connected, undirected, and simple graph can be associated with a linear chain of atoms in thermal equilibrium. This analogy provides an interpretation of the usual topological descriptors of a graph, namely the Kirchhoff index and Randić resistance, in terms of effective force constants of a linear chain. We derive an exact relation between the Kirchhoff index and the average shortest path length for a linear graph and define the free energies of a graph using an Einstein model. Second, we represent three-dimensional protein structures as connected, undirected, and simple graphs. As a proof of concept, we compute the topological descriptors and the graph free energies for an all-atom molecular dynamics trajectory of folding/unfolding events of the proteins Trp-cage and HP-36 and for the ensemble of experimental NMR models of Trp-cage. The present work shows that the local, nonlocal, and global force constants and free energies of a graph are promising tools to quantify unfolded/disordered protein states and folding/unfolding dynamics. In particular, they allow the detection of transient misfolded rigid states.
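As an illustration of the kind of relation mentioned above for a linear graph, the following minimal sketch numerically checks a standard graph-theory identity. On a path graph (as on any tree) the resistance distance between two nodes equals their shortest path length, so the Kirchhoff index equals the number of node pairs times the average shortest path length. This is not necessarily the form derived in the paper. The function names are hypothetical, and the Kirchhoff index is computed from the Moore-Penrose pseudoinverse of the graph Laplacian.

```python
import numpy as np


def path_laplacian(n):
    """Laplacian L = D - A of a linear (path) graph with n nodes."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0  # edge between node i and node i + 1
    A[idx + 1, idx] = 1.0  # undirected graph: symmetric adjacency
    return np.diag(A.sum(axis=1)) - A


def kirchhoff_index(L):
    """Kirchhoff index of a connected graph: n * trace(pseudoinverse of L)."""
    n = L.shape[0]
    return n * np.trace(np.linalg.pinv(L))


def mean_shortest_path_of_path_graph(n):
    """Average shortest path length over all node pairs of a path graph.

    On a path graph the distance between nodes i and j is |i - j|.
    """
    i = np.arange(n)
    d = np.abs(i[:, None] - i[None, :])
    return d[np.triu_indices(n, k=1)].mean()


if __name__ == "__main__":
    for n in (2, 5, 10, 36):
        kf = kirchhoff_index(path_laplacian(n))
        mean_d = mean_shortest_path_of_path_graph(n)
        n_pairs = n * (n - 1) / 2
        # Compare the Kirchhoff index with (number of pairs) x (mean distance).
        print(n, kf, n_pairs * mean_d)
```

For graphs that contain cycles, such as the contact graphs built from folded protein structures, the resistance distance is in general shorter than the shortest path length, so the two quantities no longer coincide.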