Abstract

With improvements in computer speed and algorithm efficiency, molecular dynamics (MD) simulations are sampling ever larger numbers of molecular and biomolecular conformations. Sifting these conformations, qualitatively and quantitatively, into meaningful groups is a difficult and important task, especially when considering the structure-activity paradigm. Here we present a study that combines two popular techniques, principal component (PC) analysis and clustering, to reveal the major conformational changes that occur in MD simulations. Specifically, we explored how clustering different PC subspaces affects the resulting clusters compared with clustering the complete trajectory data. As a case example, we used the trajectory data from an explicitly solvated simulation of a bacterial L11·23S ribosomal subdomain, which is a target of thiopeptide antibiotics. Clustering was performed, using K-means and average-linkage algorithms, on data spanning the first two to the first five PC subspace dimensions. For the average-linkage algorithm, we found that data-point membership, cluster shape, and cluster size depended on the selected PC subspace data. In contrast, K-means provided very consistent results regardless of the selected subspace. Since we present results on a single model system, generalization concerning the clustering of different PC subspaces of other molecular systems is currently premature. However, our hope is that this study illustrates a) the complexities in selecting the appropriate clustering algorithm, b) the complexities in interpreting and validating the results, and c) the valuable dynamic and conformational information that can be obtained by combining PC analysis with subsequent clustering.

Electronic supplementary material: The online version of this article (doi:10.1007/s00894-012-1563-4) contains supplementary material, which is available to authorized users.
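As a rough illustration of the workflow described in the abstract, the Python sketch below clusters the first two to the first five PC dimensions with K-means and with average-linkage clustering, then measures how much frame membership changes relative to the two-dimensional result. Everything in it is an assumption made for illustration: the synthetic `projections` array stands in for real PC projections, the cluster count of three is arbitrary, the helpers `kmeans`, `average_linkage`, and `rand_index` are minimal NumPy implementations rather than the software used in the study, and the Rand index is only one possible way to compare memberships.

```python
import numpy as np

def kmeans(points, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=-1)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def average_linkage(points, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest mean pairwise distance (fine for small inputs only)."""
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)      # b > a, so index a is unaffected
        clusters[a].extend(merged)
    labels = np.empty(len(points), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree
    (same cluster in both, or different clusters in both)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)
    return np.mean(same_a[upper] == same_b[upper])

# Synthetic stand-in for PC projections: three groups separated along
# PCs 1-2, plus unstructured noise along PCs 3-5 (illustrative only).
rng = np.random.default_rng(1)
blob_centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
first_two = np.vstack([
    c + rng.normal(scale=0.5, size=(15, 2)) for c in blob_centers
])
higher = rng.normal(scale=1.5, size=(len(first_two), 3))
projections = np.hstack([first_two, higher])   # shape (45 frames, 5 PCs)

n_clusters = 3   # assumed cluster count
results = {}
for n_dims in range(2, 6):
    subspace = projections[:, :n_dims]         # first n_dims PCs
    results[n_dims] = {
        "kmeans": kmeans(subspace, n_clusters),
        "average": average_linkage(subspace, n_clusters),
    }

# Compare each larger subspace with the two-dimensional clustering.
for n_dims in range(3, 6):
    for method in ("kmeans", "average"):
        score = rand_index(results[2][method], results[n_dims][method])
        print(f"{method:8s} PCs 1-{n_dims} vs PCs 1-2: Rand index = {score:.2f}")
```

A Rand index close to 1 means the frames are grouped the same way in both subspaces; lower values indicate that membership depends on how many PC dimensions were clustered.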

Highlights

  • The GTPase-associated region (GAR) on the 70S bacterial ribosome plays a central role in peptide elongation by providing a binding site for elongation factors and by coordinating GTP hydrolysis during protein synthesis [1,2,3]

  • The resulting 597×37,500 data matrix was subjected to a principal component analysis, yielding 3×199 principal components

  • The details of how the clustering is performed and how clustering can be combined with other simulation observables remain an active area of research
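A minimal sketch of the PCA step, assuming Cartesian coordinates stored one frame per column. It uses a small synthetic matrix (15 coordinates × 200 frames) in place of the 597×37,500 matrix mentioned above. The function name `pca_from_coordinates` and the random data are hypothetical, and the sketch assumes the trajectory has already been superimposed on a reference structure. The number of principal components equals the number of coordinate rows, which is why 3×199 coordinates give 3×199 components.

```python
import numpy as np

def pca_from_coordinates(data):
    """PCA of a (n_coords, n_frames) matrix with one column per frame.

    Returns eigenvalues (descending), eigenvectors (as columns), and the
    projection of every frame onto every principal component.
    """
    centered = data - data.mean(axis=1, keepdims=True)   # remove mean structure
    cov = centered @ centered.T / (data.shape[1] - 1)    # (n_coords, n_coords)
    eigvals, eigvecs = np.linalg.eigh(cov)               # ascending order
    order = np.argsort(eigvals)[::-1]                    # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    projections = eigvecs.T @ centered                   # (n_coords, n_frames)
    return eigvals, eigvecs, projections

# Synthetic stand-in for trajectory data (illustrative only).
rng = np.random.default_rng(0)
n_atoms, n_frames = 5, 200
data = rng.normal(size=(3 * n_atoms, n_frames))

eigvals, eigvecs, projections = pca_from_coordinates(data)
explained = eigvals / eigvals.sum()   # fraction of variance per component
print("number of principal components:", len(eigvals))
print("variance captured by first two PCs:", explained[:2].sum())
```

Rows of `projections` are the per-frame coordinates along each component; keeping only the first few rows gives the low-dimensional subspaces on which clustering can then be run.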

Introduction

The GTPase-associated region (GAR) on the 70S bacterial ribosome plays a central role in peptide elongation by providing a binding site for elongation factors and by coordinating GTP hydrolysis during protein synthesis [1,2,3]. The GAR is primarily formed by the ribosomal protein L11 and the helix 43-44 (H43/H44) substructure of the 23S ribosomal RNA. This L11·23S subdomain binds thiopeptide natural products (e.g., thiostrepton) that have antibacterial activity [4,5,6,7,8,9], making it a target for rational drug design [10, 11]. The N-terminal domain (NTD) of L11 shows a high degree of flexibility [13,14,15,16] relative to the C-terminal domain (CTD) and the 23S rRNA; its differing conformational states may be critical for interaction with translation factors during protein synthesis [3, 6, 17].
