Abstract
Understanding the protein folding mechanism remains a grand challenge in structural biology. In the past several years, computational theories in molecular dynamics have been employed to shed light on the folding process. With high computing power and large-scale storage, researchers can now simulate the protein folding process computationally in atomistic detail at femtosecond temporal resolution. Such simulations often produce a large number of folding trajectories, each consisting of a series of 3D conformations of the protein under study. As a result, effectively managing and analyzing such trajectories is becoming increasingly important.
In this article, we present a spatio-temporal mining approach to analyze protein folding trajectories. It exploits the simplicity of contact maps while also integrating 3D structural information into the analysis. It characterizes the dynamic folding process by first identifying spatio-temporal association patterns in contact maps, then studying how such patterns evolve along a folding trajectory. We demonstrate that such patterns can be leveraged to summarize folding trajectories and to facilitate the detection and ordering of important folding events along a folding path. We also show that such patterns can be used to identify a consensus partial folding pathway across multiple folding trajectories. Furthermore, we argue that such patterns can capture both local and global structural topology in a 3D protein conformation, thereby facilitating effective structural comparison among conformations.
We apply this approach to analyze the folding trajectories of two small synthetic proteins, BBA5 and GSGS (also called Beta3S). We show that this approach is promising for addressing the above issues, namely folding trajectory summarization, folding event detection and ordering, and consensus partial folding pathway identification across trajectories.
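To make the contact-map representation concrete, the following is a minimal sketch, not the implementation used in this work. It assumes one representative 3D coordinate per residue and treats two residues as in contact when they lie within a distance cutoff. The function name `contact_map`, the 8 Å cutoff, the sequence-separation filter, and the toy coordinates are all illustrative assumptions.

```python
from math import dist

def contact_map(coords, cutoff=8.0, min_separation=3):
    """Return a symmetric 0/1 matrix for one conformation.

    coords: list of (x, y, z) tuples, one per residue (assumed representation).
    cutoff: distance threshold in angstroms (assumed value).
    min_separation: ignore pairs closer than this along the chain (assumed filter).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Toy trajectory with two frames of four pseudo-residues (made-up coordinates).
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
trajectory = [extended, hairpin]

# One contact map per frame; changes between consecutive maps reflect
# contacts forming or breaking along the trajectory.
maps = [contact_map(frame) for frame in trajectory]
```

In this toy example the extended frame has no contacts, while the hairpin frame gains a contact between the first and last residues, which is the kind of change a trajectory analysis would track over time.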
Highlights
The three dimensional (3D) native structures of proteins have important implications in proteomics.
We realize the notion of Spatial Object Association Pattern (SOAP) to effectively capture spatial relationships among such objects. By associating SOAPs with proteins in different protein classes, we have identified multiple types of SOAPs that can potentially function as structural fingerprints for different protein classes.
We would like to identify a sub-sequence of similar conformations across trajectories. This sub-sequence of conformations is referred to as the consensus partial folding pathway. This is analogous to the Longest Common Sub-sequence (LCS) problem [17], but much more challenging due to the following reasons
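For reference, the classical LCS problem that the consensus pathway is compared to can be sketched as follows. This is the textbook dynamic-programming solution for two sequences with exact matching, not the method proposed in this work. The state labels ("U", "A", "N", and so on) are hypothetical stand-ins for conformations; real conformations would need a structural similarity measure instead of equality.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back through the table to recover the subsequence itself.
    result = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]

# Two hypothetical trajectories, each reduced to a sequence of state labels.
trajectory_1 = ["U", "A", "B", "C", "N"]
trajectory_2 = ["U", "B", "D", "C", "N"]
consensus = longest_common_subsequence(trajectory_1, trajectory_2)
```

Here `consensus` is the ordered list of states shared by both toy trajectories. The exact-match test in this sketch is the main simplification.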
Summary
The three dimensional (3D) native structures of proteins have important implications in proteomics. Understanding such structures enables us to explore the function of a protein, explain substrate and ligand binding, perform realistic drug design and potentially cure diseases caused by protein misfolding. The protein folding problem is one of the most fundamental yet unsolved problems in computational molecular biology. One major challenge in simulating the protein folding process is its complexity. Snow et al. state that performing a Molecular Dynamics (MD) simulation on a mini-protein for just 10 [...]. Researchers in the Folding@home project recently proposed a World Wide Web-based computing model to simulate the protein folding process [2].