Predicting the solvent accessible surface area (ASA) of transmembrane (TM) residues is of great importance for experimental researchers seeking to elucidate diverse physiological processes. TM residues fall into two major structural classes: those in α-helix membrane proteins and those in β-barrel membrane proteins. Previously reported solvent ASA prediction models were developed separately for each of these two types of TM residues. This limits their general use, because one cannot determine which model is suitable for a given TM residue without knowing its type. To overcome this limitation, we developed a new computational model that predicts the ASA of both TM α-helix and β-barrel residues. The model was developed from 78 α-helix membrane protein chains and 24 β-barrel membrane protein chains. Its predictive ability was evaluated by cross-validation and on an independent test set of 20 membrane protein chains. The results show that our model performs well for both types of TM residues and outperforms other prediction models that were developed for a specific type of TM residue. The prediction results also demonstrate that a random forest model incorporating conservation scores is an effective sequence-based computational approach for predicting the solvent ASA of TM residues.