Abstract
This paper addresses the prediction of the dimensionless retention time (DRT) of proteins in hydrophobic interaction chromatography (HIC) using mathematical models based essentially only on amino acid composition. The results show that such prediction is indeed possible. Our main contribution is the design of models that predict the DRT from the minimal information available about a protein: its amino acid composition. Their performance is similar to that of models that use much more sophisticated information, such as the three-dimensional structure of proteins. Three models that, in addition to the amino acid composition, use different assumptions about the tendency of amino acids to be exposed to the solvent were evaluated on 12 proteins with known experimental DRT. In all the cases analyzed, the best results were obtained by the model based on a linear estimation of the amino acid surface composition. The models were adjusted using a collection of 74 vectors of amino acid properties plus a set of 6388 vectors derived from these using two mathematical tools: the k-means and self-organizing map (SOM) algorithms. The best vector was generated by the SOM algorithm and was interpreted as a hydrophobicity scale based partly on the tendency of amino acids to be buried in proteins. The prediction error (MSE JK) obtained by this model was almost 35% smaller than that of the model that assumes all amino acids are completely exposed, and 40% smaller than that of the model that uses a simple correction factor for the general tendency of each amino acid to be exposed to the solvent. In fact, the best model based on amino acid composition performed 5% better than the model based on the three-dimensional structure of proteins.
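As a rough illustration of the simplest of the three approaches (all amino acids treated as fully exposed), the sketch below computes a composition-weighted average of a per-residue property vector and scores a fitted model with a leave-one-out error, assumed here to correspond to the MSE JK quoted above. The abstract does not state the functional form relating the averaged property to the DRT, so the straight-line fit is an assumption made only for illustration, and all function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequence):
    """Return the fraction of each standard amino acid in a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def average_property(sequence, scale):
    """Composition-weighted mean of a per-residue property vector.

    Assumes every residue is fully exposed; `scale` maps each one-letter
    code to a property value (for example, a hydrophobicity scale).
    """
    comp = composition(sequence)
    return sum(comp[a] * scale[a] for a in AMINO_ACIDS)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (assumed model form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_mse(xs, ys):
    """Refit with each point held out, predict it, and average the squared errors."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        prediction = slope * xs[i] + intercept
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)
```

In use, `average_property` would be evaluated once per protein for a candidate property vector, and `leave_one_out_mse` would then compare candidate vectors by how well the fitted model predicts the experimental DRT of each held-out protein.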