Abstract

Machine learning (ML) methods have had a broad and tremendous impact on structural and dynamical studies of different classes of proteins. Nonetheless, the application of ML models to protein-membrane interactions has received less attention. Given their crucial importance to several aspects of cell communication, we focus on peripheral proteins. In this work, we develop a novel tokenization algorithm for protein-membrane complexes that accounts for spatial relationships between the membrane-bound phase of a protein and the lipid bilayer. Using this tokenization method, we build an autoencoder-based ML workflow for predicting contact maps between residues in a peripheral membrane-binding protein and selected key lipids in a lipid bilayer. The ML model is trained on a series of molecular dynamics (MD) simulations of different PIP-binding proteins with lipid bilayers that contain PIP lipids. We then predict the contact maps of PIP-binding proteins not present in the training set and verify the model's predictions against MD results. A key element of our tokenization scheme is the embedding of distances between the residues of the protein's membrane-bound phase and the lipids in the bilayer; we believe this provides a productive feature space for our ML workflow. Details of the ML workflow and results will be discussed. The tokenization method we employ may facilitate the application of previously untested ML models to problems involving protein-membrane interactions. Furthermore, our contact map prediction results motivate further exploration of ML applications for systems containing a lipid bilayer.
