Abstract

Machine learning research on protein structure has surged in popularity in recent years, with promising advances for basic science and drug discovery. Working with macromolecular structure in a machine learning context requires an adequate numerical representation, and researchers have extensively studied representations such as graphs, discretized 3D grids, and distance maps. As part of CASP14, we explored a new and conceptually simple representation in a blind experiment: atoms as points in 3D, each with associated features. These features, initially just the element type of each atom, are updated through a series of neural network layers that use rotation-equivariant convolutions. Starting from all atoms, we then aggregate information at the level of alpha carbons before making a prediction at the level of the entire protein structure. We find that this approach yields competitive results in protein model quality assessment despite its simplicity, its minimal use of prior information, and the relatively small amount of data it is trained on. Its performance and generality are particularly noteworthy in an era where highly complex, customized machine learning methods such as AlphaFold 2 have come to dominate protein structure prediction.
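To make the data flow described above concrete, the following is a minimal sketch of the three levels (all atoms, alpha carbons, whole structure) and is not the network used in this work. It is an illustration under stated assumptions: the learned rotation-equivariant convolutions are replaced by a fixed distance-based average, which is only rotation-invariant, and the names `Atom`, `ELEMENTS`, `pool_to_alpha_carbons`, `protein_descriptor`, the 4.0 Å radius, and the toy coordinates are all hypothetical.

```python
import math
from dataclasses import dataclass

# Assumed element vocabulary, chosen for illustration only.
ELEMENTS = ("C", "N", "O", "S")


@dataclass
class Atom:
    element: str  # basic element type, the only initial feature
    name: str     # atom name, e.g. "CA" for an alpha carbon
    xyz: tuple    # (x, y, z) coordinates


def one_hot(element):
    """Initial per-atom feature: one-hot encoding of the element type."""
    return [1.0 if element == e else 0.0 for e in ELEMENTS]


def pool_to_alpha_carbons(atoms, radius=4.0):
    """For each alpha carbon, average the features of all atoms within `radius`.

    Stand-in for the learned layers: it uses only interatomic distances,
    so rotating the structure leaves the result unchanged.
    """
    feats = [one_hot(a.element) for a in atoms]
    pooled = []
    for ca in atoms:
        if ca.name != "CA":
            continue
        near = [f for a, f in zip(atoms, feats)
                if math.dist(a.xyz, ca.xyz) <= radius]
        pooled.append([sum(col) / len(near) for col in zip(*near)])
    return pooled


def protein_descriptor(atoms, radius=4.0):
    """Average the alpha-carbon features into one vector per structure."""
    pooled = pool_to_alpha_carbons(atoms, radius)
    return [sum(col) / len(pooled) for col in zip(*pooled)]


def rotate_z(atoms, angle):
    """Rotate all atoms about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [Atom(a.element, a.name,
                 (c * a.xyz[0] - s * a.xyz[1],
                  s * a.xyz[0] + c * a.xyz[1],
                  a.xyz[2]))
            for a in atoms]


# Toy two-residue backbone fragment with made-up coordinates.
atoms = [
    Atom("N", "N", (0.0, 0.0, 0.0)),
    Atom("C", "CA", (1.5, 0.0, 0.0)),
    Atom("C", "C", (2.0, 1.4, 0.0)),
    Atom("O", "O", (1.2, 2.3, 0.0)),
    Atom("N", "N", (3.3, 1.6, 0.0)),
    Atom("C", "CA", (4.0, 2.9, 0.0)),
    Atom("C", "C", (5.5, 2.7, 0.0)),
    Atom("O", "O", (6.0, 1.6, 0.0)),
]

descriptor = protein_descriptor(atoms)
rotated_descriptor = protein_descriptor(rotate_z(atoms, 0.7))
```

In this sketch `descriptor` and `rotated_descriptor` agree up to floating-point error, which is the property a structure-level quality score needs. An equivariant network goes further by letting intermediate features carry directional information that transforms consistently with the input, which this simplified pooling does not attempt.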
