Abstract

Datasets representing the conformational landscapes of protein structures are high-dimensional and hence present computational challenges. Efficient and effective dimensionality reduction of these datasets is therefore paramount to our ability to analyze the conformational landscapes of proteins and extract important information regarding protein folding, conformational changes, and binding. Representing the structures with fewer attributes that capture the most variance in the data makes for a quicker and more precise analysis of these structures. In this study, we make use of dimensionality reduction methods for reducing the number of instances and for feature reduction. The reduced dataset that is obtained is then subjected to topological and quantitative analysis. In this step, we perform hierarchical clustering to obtain different sets of conformation clusters that may correspond to intermediate structures. The structures represented by these conformations are then analyzed by studying their high-dimensional topological properties to identify truly distinct conformations and holes in the conformational space that may represent high energy barriers. Our results show that the clusters closely follow known experimental results about intermediate structures as well as binding and folding events. Doi: 10.28991/HEF-SP2022-01-01 Full Text: PDF

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call