Structures can now be predicted for any protein using programs like AlphaFold and Rosetta, which rely on a foundation of experimentally determined structures of architecturally diverse proteins. The accuracy of such artificial intelligence and machine learning (AI/ML) approaches benefits from the specification of restraints, which assist in navigating the universe of folds to converge on models most representative of a given protein's physiological structure. This is especially pertinent for membrane proteins, whose structures and functions depend on their presence in lipid bilayers. Structures of proteins in their membrane environments could conceivably be predicted by AI/ML approaches given user-specified parameters that describe each architectural element of a membrane protein together with its lipid environment. We propose the Classification Of Membrane Proteins based On Structures Engaging Lipids (COMPOSEL), which builds on existing nomenclature for monotopic, bitopic, polytopic and peripheral membrane proteins as well as lipids. Functional and regulatory elements are also defined in the COMPOSEL scripts, as shown with membrane-fusing synaptotagmins, multidomain PDZD8 and Protrudin proteins that recognize phosphoinositide (PI) lipids, the intrinsically disordered MARCKS protein, caveolins, the β-barrel assembly machine (BAM), an adhesion G protein-coupled receptor (aGPCR) and two lipid-modifying enzymes: diacylglycerol kinase DGKε and fatty aldehyde dehydrogenase FALDH. These examples demonstrate how COMPOSEL communicates lipid interactivity as well as signaling mechanisms and the binding of metabolites, drug molecules, polypeptides or nucleic acids to describe the operations of any protein. Moreover, COMPOSEL can be scaled to express how genomes encode membrane structures and how our organs are infiltrated by pathogens such as SARS-CoV-2.