Predicting the backbone torsion angles of each residue of a protein from its amino acid sequence alone is a challenging problem in computational biology. Existing torsion angle predictors rely mainly on profile features, which are generated by time-consuming multiple sequence alignments. Compared with these traditional profile features, embedding features from pretrained protein language models offer significant advantages in both prediction performance and computational speed. However, embedding features usually have higher dimensionality, and that dimensionality differs substantially from one language model to another. To address this, we design PHAngle, a novel parameter-efficient deep torsion angle predictor built specifically for embedding features. PHAngle is a parameterized hypercomplex convolutional network consisting of parameterized hypercomplex linear and convolutional layers whose weight parameters can be expressed as a sum of Kronecker products. Experimental results on six benchmark test sets (TEST2016, TEST2018, TEST2020_HQ, CASP12, CASP13 and CASP-FM) demonstrate that PHAngle achieves state-of-the-art torsion angle prediction performance with the fewest parameters among the methods compared, which include nine existing predictors. The source code and datasets are available at https://github.com/fengtuan/PHAngle.
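
To illustrate how a weight matrix built from a sum of Kronecker products can reduce the parameter count, the following is a minimal sketch and not PHAngle's actual implementation. It assumes the common parameterized hypercomplex form W = sum_i kron(A_i, F_i), with n small matrices A_i of size n x n and n matrices F_i of size (d_out/n) x (d_in/n). The function names and the layer sizes are hypothetical and chosen only for illustration.

```python
import random


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def mat_add(x, y):
    """Element-wise sum of two matrices of equal shape."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def phm_weight(a_mats, f_mats):
    """Assemble W = sum_i kron(A_i, F_i) (assumed hypercomplex form)."""
    w = kron(a_mats[0], f_mats[0])
    for a, f in zip(a_mats[1:], f_mats[1:]):
        w = mat_add(w, kron(a, f))
    return w


def phm_param_count(n, d_in, d_out):
    """Parameters in n matrices A_i (n x n) plus n matrices F_i."""
    return n * n * n + n * (d_out // n) * (d_in // n)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Small hypothetical layer: n = 2, input size 8, output size 4.
rng = random.Random(0)
n, d_in, d_out = 2, 8, 4
a_mats = [random_matrix(n, n, rng) for _ in range(n)]
f_mats = [random_matrix(d_out // n, d_in // n, rng) for _ in range(n)]
w = phm_weight(a_mats, f_mats)  # full d_out x d_in weight matrix

# Parameter comparison for a larger hypothetical layer.
dense_params = 1024 * 256
phm_params = phm_param_count(4, 1024, 256)
print(len(w), len(w[0]), dense_params, phm_params)
```

Under these assumptions, a layer with d_in = 1024, d_out = 256 and n = 4 stores 65,600 parameters instead of the 262,144 of a dense weight matrix, roughly a factor of n fewer. The n^3 term from the A_i matrices is negligible at these sizes.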