Abstract

Heat shock proteins (HSPs) from different families and sub-types play a vital role in the folding and unfolding of proteins, in maintaining cellular health, and in preventing serious disorders. Previous computational methods for HSP classification have yielded promising performance. However, most of them rely heavily on amino acid composition features and still face challenges in interpretability and accuracy. To overcome these issues, we introduce FSP4HSP (“FSP for HSP”), a novel frequent sequential pattern (FSP)-based method for analyzing and classifying HSPs, their families, and sub-types. FSP4HSP identifies FSPs of amino acids (FSPAAs) and uses them for both analysis and classification. Besides FSPAAs, it also discovers sequential rules among amino acids. Both binary and multi-class classification scenarios are considered, using eight integer-based and four string-based classifiers. Incorporating FSPAAs into the classification/prediction task enhances the interpretability of FSP4HSP, and a comprehensive performance comparison using various evaluation measures demonstrates that it surpasses existing methods for the classification/recognition of HSPs.
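
To make the idea of pattern-based features concrete, the following is a minimal sketch and not the FSP4HSP implementation. It assumes a simplified setting in which patterns are contiguous amino-acid substrings of fixed length `k` (true sequential patterns may also allow gaps), and in which a pattern is "frequent" when the fraction of sequences containing it reaches `min_support`. The function names, the toy sequences, and both parameters are hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequent_patterns(sequences, k=2, min_support=0.5):
    """Return contiguous length-k patterns with support >= min_support.

    Support is the fraction of sequences that contain the pattern
    at least once (a simplifying assumption for this sketch).
    """
    counts = Counter()
    for seq in sequences:
        # Use a set so each pattern is counted at most once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    n = len(sequences)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def to_features(seq, patterns):
    """Encode a sequence as a binary vector: 1 if the pattern occurs in it."""
    return [1 if p in seq else 0 for p in patterns]


# Toy, made-up sequences (not real HSP data).
toy = ["MKVLAA", "MKVGAA", "GGMKAA"]
patterns = sorted(frequent_patterns(toy, k=2, min_support=1.0))
features = to_features("MKAA", patterns)
```

Because each feature corresponds to a named pattern, a classifier trained on such vectors can be inspected pattern by pattern, which is the kind of interpretability the abstract attributes to FSPAAs.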
