Abstract

Efficiently and accurately analyzing high-dimensional time series, such as the molecular dynamics (MD) trajectories of biomolecules, is a long-standing challenge. Two distinct but related families of techniques, dimension reduction methods and clustering algorithms, have been developed and widely applied in this field. Here we show that combining these techniques further improves the analysis, especially for very complex data. Specifically, we present an approach that combines the trajectory mapping (TM) method, which constructs slow collective variables of a time series, with density peak clustering (DPC) [A. Rodriguez and A. Laio, Science 344, 1492 (2014), doi:10.1126/science.1242072], which groups similar data points into clusters in a static data set. We illustrate the application of the combined TMDPC approach with hundreds of microseconds of all-atom MD trajectories of two proteins, the villin headpiece and protein G. The results show that TMDPC is a powerful tool for identifying the metastable states and slow dynamics of these high-dimensional time series, because it efficiently accounts for both the temporal ordering of the data points and the geometric distances between them.
