Abstract

Understanding the protein-folding process is an outstanding issue in biophysics; recent developments in molecular dynamics simulation have provided insights into this phenomenon. However, the large freedom of atomic motion hinders the understanding of this process. In this study, we applied persistent homology, an emerging method to analyze topological features in a data set, to reveal protein-folding dynamics. We developed a new, to our knowledge, method to characterize the protein structure based on persistent homology and applied this method to molecular dynamics simulations of chignolin. Using principal component analysis or nonnegative matrix factorization, our analysis method revealed two stable states and one saddle state, corresponding to the native, misfolded, and transition states, respectively. We also identified an unfolded state with slow dynamics in the reduced space. Our method serves as a promising tool to understand the protein-folding process.

Highlights

  • Since the proposal of Levinthal’s paradox in 1968, the folding of biomolecules, including proteins, has attracted the interest of numerous scientists [1]

  • We calculated the ‘‘topological feature vector’’ (TFV) from snapshots of chignolin and reduced the configuration into low-dimensional spaces by principal component analysis (PCA) and nonnegative matrix factorization (NMF)

  • The first principal component does not contribute to the cluster identification. These results indicate that the PCA is strongly affected by structural changes, which are not related to the transition between the folded and misfolded states
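
As a rough illustration of the dimensionality-reduction step named in the highlights, the sketch below estimates the leading principal axis of a set of feature vectors by power iteration on their covariance matrix. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `first_principal_component`, the toy rows, and the iteration count are hypothetical, and the rows merely stand in for per-snapshot feature vectors such as the TFV.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the leading principal axis of `rows` by power iteration.

    `rows` is a list of equal-length feature vectors (one per snapshot).
    Returns the unit-length axis and the projection of each centered row onto it.
    """
    n, dim = len(rows), len(rows[0])
    # Center each feature on its mean.
    means = [sum(r[k] for r in rows) / n for k in range(dim)]
    centered = [[r[k] - means[k] for k in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(dim)]
        for a in range(dim)
    ]
    axis = [1.0] * dim  # arbitrary starting vector (assumption)
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * axis[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        axis = [v / norm for v in nxt]
    scores = [sum(c[k] * axis[k] for k in range(dim)) for c in centered]
    return axis, scores


# Hypothetical toy data: variance is dominated by the first feature.
toy_rows = [[0.0, 0.0], [1.0, 0.1], [2.0, -0.1], [3.0, 0.0]]
toy_axis, toy_scores = first_principal_component(toy_rows)
```

In this toy case the leading axis aligns almost entirely with the first feature. As the highlight notes, the highest-variance direction need not be the one that separates the states of interest, so later components may be more informative.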

Summary

INTRODUCTION

Since the proposal of Levinthal’s paradox in 1968, the folding of biomolecules, including proteins, has attracted the interest of numerous scientists [1]. In a small protein that has only one β-sheet, a small change in the bond angle at the hairpin of the molecule may disrupt the β-sheet structure. This small change in the angle results in a large deformation. We propose using topological data analysis (TDA) to characterize the structure and deformation of a protein. TDA has several advantages over standard protein structure analysis tools such as Ramachandran plots, distance matrices, and atomic Cartesian coordinates. Escolar et al. developed a method to calculate ‘‘volume-optimal cycles,’’ which enables identification of the atoms that form loops or cavities [17,18]. This method is useful for interpreting persistent homology (PH) results and has revealed hidden structures in glass and amorphous polymers [15,16]. The challenges to overcome, as well as the future direction, are discussed in Conclusions.
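
To make the idea of persistent homology concrete, the following minimal sketch computes only the 0-dimensional part (connected components) for a small point cloud. It grows a distance threshold and records the scale at which components merge. This is an illustration under assumptions, not the method of the paper: the distance-threshold filtration, the function name `h0_persistence`, and the toy coordinates are all hypothetical, and loops and cavities (higher-dimensional features) are not handled here.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the death scales of 0-dimensional classes, in increasing order.

    Every point is born as its own component at scale 0. A component dies at
    the pairwise distance at which it merges into another component. The one
    component that never dies is omitted.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # merge the two components
            deaths.append(d)
    return deaths


# Hypothetical toy cloud: two tight pairs separated by a larger gap.
toy_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
toy_deaths = h0_persistence(toy_points)
```

Here the two short-lived components (death at distance 1) reflect the tight pairs, while the longer-lived one (death at distance 4) reflects the gap between them. Separating features by lifetime in this way is the kind of multiscale summary that persistent homology provides.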

METHODS
RESULTS AND DISCUSSION
CONCLUSIONS