Abstract

Extracellular matrix (ECM) proteins play a major role in the tissues of multicellular organisms. The ECM provides structural support for cells inside a tumor. It also acts homeostatically to mediate interactions between cells. However, current bioinformatics tools for predicting ECM proteins often fail. This paper introduces a method for predicting ECM proteins from the protein sequence as well as its molecular characteristics. We report a novel hybrid animal migration optimization and random forest method (AMORF) that predicts ECM protein sequences using four different feature design methods. Within AMORF, binary animal migration optimization is used to select a near-optimal subset of the informative features that are most relevant for classification. AMORF was evaluated on a data set comprising 145 ECM and 3887 non-ECM proteins. Our algorithm achieves an accuracy of 86.4700%, a sensitivity of 84.9655%, a specificity of 86.5261%, a Matthews correlation coefficient of 0.3627, and an area under the receiver operating characteristic curve of 0.877804. The results confirm that the proposed method is promising: it can choose small subsets of features and still increase the classification efficiency.
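
The reported metrics are all functions of a binary confusion matrix, and on a data set this imbalanced (145 positives against 3887 negatives) they can diverge sharply. The sketch below is an illustration only, not the authors' evaluation code. The confusion-matrix counts are hypothetical, chosen by rounding the reported sensitivity and specificity to whole proteins; the abstract does not give the actual counts, and the function name `binary_metrics` is an assumption.

```python
from math import sqrt


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall on ECM proteins)
    specificity = tn / (tn + fp)   # true negative rate (recall on non-ECM proteins)
    precision = tp / (tp + fp)     # fraction of predicted ECM proteins that are ECM
    # Matthews correlation coefficient: uses all four cells of the matrix.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }


# Hypothetical counts (NOT from the paper): 145 ECM and 3887 non-ECM proteins,
# with roughly 85% sensitivity and 86.5% specificity.
metrics = binary_metrics(tp=123, fn=22, tn=3363, fp=524)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, the false positives outnumber the true positives by roughly four to one, so precision is low even though accuracy, sensitivity, and specificity all sit near 85%. This is one plausible reading of why the Matthews correlation coefficient in the abstract is much lower than the other figures.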
