Abstract

Time-lagged independent component analysis (tICA) is a widely used dimension reduction method for the analysis of molecular dynamics (MD) trajectories and has proven particularly useful for the construction of protein dynamics Markov models. It identifies those “slow” collective degrees of freedom onto which the projections of a given trajectory show maximal autocorrelation for a given lag time. Here we ask how much information on the actual protein dynamics and, in particular, on the free energy landscape that governs these dynamics the tICA-projections of MD trajectories contain, as opposed to noise due to the inherently stochastic nature of each trajectory. To answer this question, we have analyzed the tICA-projections of high-dimensional random walks using a combination of analytical and numerical methods. We find that the projections resemble cosine functions and strongly depend on the lag time, exhibiting strikingly complex behavior. In particular, and contrary to previous studies of principal component projections, the projections change noncontinuously with increasing lag time. The tICA-projections of selected 1 μs protein trajectories and those of random walks are strikingly similar, particularly for larger proteins, suggesting that these trajectories contain little information on the energy landscape that governs the actual protein dynamics. Furthermore, the tICA-projections of random walks show clusters very similar to those observed for the protein trajectories, suggesting that clusters in the tICA-projections of protein trajectories do not necessarily reflect local minima in the free energy landscape. We also conclude that, in addition to the previous finding that certain ensemble properties of nonconverged protein trajectories resemble those of random walks, the same is true for their time correlations.

Highlights

  • The most notable dimension reduction approaches are principal component analysis (PCA), which extracts the essential dynamics[13] of the protein that contributes most to the atomic fluctuations, and time-lagged independent component analysis (tICA), which identifies those collective degrees of freedom that exhibit the strongest time correlations for a given lag time.[14,15]

  • Both dimension reduction techniques can yield information on the conformational dynamics of a protein, that is, how the protein moves through several conformational substates, which can be defined as metastable conformations characterized by local free energy minima.[16]

Summary

INTRODUCTION

The atomistic dynamics of proteins, protein complexes, and other biomolecules is exceedingly complex, covering time scales from subpicoseconds up to hours.[1,2] It is governed by a complex high-dimensional free energy landscape or funnel,[3] characterized by a hierarchy of free energy barriers,[4] and has been widely studied computationally by molecular dynamics (MD) simulations.[5] Earlier analyses of PCA projections of random walks provide a very powerful criterion for the convergence of MD trajectories: the more the projection of an MD trajectory resembles a cosine, as quantified by the cosine content,[21] the more the trajectory resembles a random walk, and the less information it contains on the actual protein dynamics or the underlying free energy landscape. These analyses[21,22] have suggested that clusters observed in low-dimensional PCA projections do not necessarily imply the existence of conformational substates and may instead be a stochastic and/or projection artifact. For tICA, the much richer and more intricate structure of random walk projections renders the proper interpretation of tICA-projections of protein dynamics trajectories challenging and has profound implications for the proper construction of Markov models.
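The cosine-content criterion mentioned above can be illustrated with a short numerical sketch. It assumes the commonly used definition c_i = (2/T) (Σ_t cos(iπt/T) p(t))² / Σ_t p(t)² for a projection p(t) of length T; the function names, the midpoint time sampling, and the test systems (a 50-dimensional Gaussian random walk and uncorrelated noise) are illustrative choices, not taken from the paper.

```python
import numpy as np


def cosine_content(p, i=1):
    """Cosine content of a 1-D projection p with respect to the i-th cosine."""
    p = np.asarray(p, dtype=float)
    T = p.size
    # Midpoint sampling of cos(i*pi*t/T). With this choice sum(cos**2) == T/2,
    # so by the Cauchy-Schwarz inequality the result lies in [0, 1].
    t = (np.arange(T) + 0.5) / T
    c = np.cos(i * np.pi * t)
    return (2.0 / T) * np.dot(c, p) ** 2 / np.dot(p, p)


def first_pc_projection(X):
    """Project a trajectory X (frames x coordinates) onto its first principal component."""
    Xc = X - X.mean(axis=0)                      # remove the time average
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0] * S[0]                        # projection onto the top PC


rng = np.random.default_rng(0)
n_steps, n_dim = 2000, 50                        # hypothetical sizes

# High-dimensional random walk: cumulative sum of independent Gaussian steps.
walk = np.cumsum(rng.standard_normal((n_steps, n_dim)), axis=0)
# Reference without any time correlation: uncorrelated Gaussian noise.
noise = rng.standard_normal((n_steps, n_dim))

cc_walk = cosine_content(first_pc_projection(walk))
cc_noise = cosine_content(first_pc_projection(noise))
print(f"cosine content, random walk PC1: {cc_walk:.3f}")
print(f"cosine content, white noise PC1: {cc_noise:.3f}")
```

In this sketch a value close to 1 for the first principal component indicates random-walk-like (nonconverged) behavior, while a value near 0 indicates no resemblance to a half-period cosine. Applying the same function to projections of an actual MD trajectory would require loading and aligning the coordinates first, which is omitted here.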

THEORETICAL ANALYSIS AND METHODS
RESULTS AND DISCUSSION
CONCLUSIONS
ACKNOWLEDGMENTS
REFERENCES