Abstract

The folding dynamics of proteins is a primary area of interest in protein science. We carried out topological data analysis (TDA) of the folding process of HP35(nle-nle), a double-mutant of the villin headpiece subdomain. Using persistent homology and non-negative matrix factorization, we reduced the dimension of protein structure and investigated the flow in the reduced space. We found this protein has two folding paths, distinguished by the pairings of inter-helix residues. Our analysis showed the excellent performance of TDA in capturing the formation of tertiary structure.

Highlights

  • The folding dynamics of proteins is a primary area of interest in protein science

  • We proposed a feature construction method based on persistent homology (PH), the most widely used topological data analysis (TDA) method [10]

  • We carried out non-negative matrix factorization (NMF) for L = 4, 5, and 6 and found no quantitative changes
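The rank scan mentioned in the last highlight can be illustrated with a small sketch. It assumes that L denotes the number of NMF components, which the text here does not state explicitly, and it uses generic Lee-Seung multiplicative updates on a random non-negative matrix. The helper names (`matmul`, `nmf`, `frobenius_error`), the toy data, and the choice of solver are illustrative assumptions, not the authors' implementation or their PH-derived features.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def nmf(v, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive start so the multiplicative updates stay non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# Toy stand-in for a (frames x features) matrix; NOT data from the paper.
data_rng = random.Random(42)
V = [[data_rng.random() for _ in range(8)] for _ in range(12)]

errors = {}
factors = {}
for L in (4, 5, 6):
    W, H = nmf(V, rank=L)
    factors[L] = (W, H)
    errors[L] = frobenius_error(V, W, H)
    print(f"L = {L}: reconstruction error = {errors[L]:.4f}")
```

Comparing the factors and reconstruction errors across L in this way is one simple check of whether the decomposition is sensitive to the number of components.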

Summary

Introduction

The folding dynamics of proteins is a primary area of interest in protein science. We carried out topological data analysis (TDA) of the folding process of HP35(nle-nle), a double-mutant of the villin headpiece subdomain. Data-scientific methods, such as principal component analysis (PCA) or k-means clustering, have been used to study the folding structures. However, a small change in a local bending angle causes a large change in the positions of all residues, so linear analyses such as PCA or k-means sometimes provide insufficient information. To overcome this difficulty, complex procedures are often needed. Jain and Stock used a combination of dihedral PCA, k-means clustering, Markov state modeling, and hierarchical clustering to analyze the folding of protein HP35 [3]. Another approach is the application of non-linear analysis, such as kernel PCA, isomap, or t-distributed stochastic neighbor embedding. Yao et al. analyzed the folding path of RNA using Mapper, a popular TDA method [6].
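The sensitivity of Cartesian coordinates to a local bending angle can be seen in a toy model. The sketch below assumes a planar chain of 35 beads with unit bond length, which is a deliberate simplification and not a model of HP35; the function names `chain_coords` and `displacements`, the joint index, and the 0.1 rad perturbation are all hypothetical choices made for illustration.

```python
import math

def chain_coords(bend_angles, bond_length=1.0):
    """Place the beads of a planar chain; bend_angles[i] is the turn (radians) at joint i."""
    x, y, heading = 0.0, 0.0, 0.0
    coords = [(x, y)]
    # The first bond lies along the x axis.
    x += bond_length
    coords.append((x, y))
    for turn in bend_angles:
        heading += turn
        x += bond_length * math.cos(heading)
        y += bond_length * math.sin(heading)
        coords.append((x, y))
    return coords

def displacements(a, b):
    """Distance each bead moves between two conformations."""
    return [math.dist(p, q) for p, q in zip(a, b)]

n_joints = 33  # 2 + 33 = 35 beads
straight = chain_coords([0.0] * n_joints)

bent_angles = [0.0] * n_joints
bent_angles[5] = 0.1  # bend a single joint by 0.1 rad (about 5.7 degrees)
bent = chain_coords(bent_angles)

moved = displacements(straight, bent)
bond_lengths = [math.dist(bent[i], bent[i + 1]) for i in range(len(bent) - 1)]

print("beads displaced:", sum(d > 1e-12 for d in moved), "of", len(moved))
print("displacement of last bead (bond lengths):", round(moved[-1], 3))
print("all bond lengths unchanged:", all(abs(b - 1.0) < 1e-9 for b in bond_lengths))
```

In this toy chain a single small bend leaves every bead before the joint in place and shifts every bead after it, with the last bead moving by roughly 2.8 bond lengths, while all bond lengths stay the same. Descriptors built directly on Cartesian positions therefore register a large change for what is a small local rearrangement.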

