Abstract

Efficient and accurate analysis of high-dimensional time series, such as molecular dynamics (MD) trajectories of biomolecules, is a long-standing challenge. Two distinct but related families of techniques, dimension reduction methods and clustering algorithms, have been developed and widely applied in this field. Here we show that combining these techniques further improves the analysis, especially for very complicated data. Specifically, we present an approach that combines the trajectory mapping (TM) method, which constructs slow collective variables of a time series, with density peak clustering (DPC) [A. Rodriguez and A. Laio, Science 344, 1492 (2014), doi:10.1126/science.1242072], which groups similar data points into clusters in a static data set. We illustrate the combined TMDPC approach with hundreds of microseconds of all-atom MD trajectories of two proteins, the villin headpiece and protein G. The results show that TMDPC is a powerful tool for identifying the metastable states and slow dynamics of these high-dimensional time series, because it efficiently accounts for both the temporal ordering of the data points and the geometric distances between them.
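
As an illustration of the clustering half of the approach only, the following is a minimal sketch of density peak clustering in the spirit of Rodriguez and Laio, not the implementation used in this work. It assumes Euclidean distances, a hard cutoff density, and a user-chosen number of clusters; the function name `density_peak_clustering`, the parameters `d_c` and `n_clusters`, and the synthetic two-blob data are hypothetical choices made for this example. The TM step is not shown, so the input here is a static point set rather than TM collective variables.

```python
import numpy as np


def density_peak_clustering(points, d_c, n_clusters):
    """Sketch of density peak clustering on a static point set.

    points     : (n, d) array of coordinates (assumed Euclidean)
    d_c        : cutoff distance used to count neighbours (assumed parameter)
    n_clusters : number of cluster centers to keep (assumed parameter)
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Pairwise Euclidean distance matrix.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # Local density rho_i: number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1

    # Visit points from highest to lowest density (ties broken by index).
    order = np.argsort(-rho, kind="stable")

    # delta_i: distance to the nearest point that comes earlier in `order`.
    delta = np.zeros(n)
    nearest_denser = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, use its largest distance.
            delta[i] = dist[i].max()
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        nearest_denser[i] = j

    # Cluster centers combine high density with large separation.
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]

    # Each remaining point inherits the label of its nearest denser point.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels, centers


# Usage on synthetic data: two well-separated 2D blobs of 50 points each.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 2)),
    rng.normal(5.0, 0.3, size=(50, 2)),
])
labels, centers = density_peak_clustering(points, d_c=0.5, n_clusters=2)
```

In this sketch the cutoff `d_c` and the number of centers are fixed by hand; in practice they are usually chosen by inspecting the density and separation values of the data at hand.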
