Abstract

Over the past few decades, the field of bioinformatics has accumulated a large amount of gene expression data, which provides important support for the diagnosis of disease. However, high dimensionality, small sample sizes, and redundant features often adversely affect the accuracy and speed of prediction, and existing feature selection models cannot extract the information in these datasets accurately. Filter and wrapper methods are two commonly used approaches to feature selection. To combine the fast computation of the filter with the high accuracy of the wrapper, a new hybrid algorithm called MIIBGSA is proposed, which hybridizes mutual information and an improved Gravitational Search Algorithm (GSA). First, mutual information is used to rank and select important features, and these features are then passed into the population of the wrapper method. Then, because of its effectiveness, GSA is adopted to search further for an optimal feature subset. However, GSA suffers from slow search speed and premature convergence, which limit its optimization ability. In our work, a scale function is added to the speed update to enhance its search ability, an adaptive kbest particle update formula is proposed to improve its convergence accuracy, and a fitness sharing strategy, based on the niche algorithm of fitness sharing, is proposed to enhance the randomness and search ability of the particle population. We used 10-fold cross-validation (10-fold CV) with the KNN classifier to evaluate classification accuracy. Experimental results on five publicly available high-dimensional biomedical datasets show that the proposed MIIBGSA outperforms the other algorithms compared.
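The filter stage and the wrapper's fitness evaluation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete feature values, uses a plain plug-in estimate of mutual information, and scores a candidate subset by cross-validated k-NN accuracy with a simple interleaved fold split. The function names (`mutual_information`, `rank_features`, `cv_accuracy`), the toy data, and the choice of k are all hypothetical, and the improved GSA search itself is not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def rank_features(X, y, top_k):
    """Filter stage: indices of the top_k columns of X ranked by mutual information with y."""
    n_features = len(X[0])
    scores = [mutual_information([row[j] for row in X], y) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:top_k]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, subset, k=1, folds=10):
    """Wrapper fitness: cross-validated k-NN accuracy using only the columns in subset."""
    Xs = [[row[j] for j in subset] for row in X]
    correct = 0
    for f in range(folds):
        # Interleaved split: sample i belongs to fold i % folds (an assumption of this sketch)
        train_idx = [i for i in range(len(Xs)) if i % folds != f]
        test_idx = [i for i in range(len(Xs)) if i % folds == f]
        tX = [Xs[i] for i in train_idx]
        ty = [y[i] for i in train_idx]
        for i in test_idx:
            if knn_predict(tX, ty, Xs[i], k) == y[i]:
                correct += 1
    return correct / len(Xs)


# Toy data (hypothetical): feature 0 copies the label, feature 1 is constant,
# feature 2 is statistically independent of the label.
y = [i % 2 for i in range(20)]
X = [[y[i], 0, (i // 2) % 2] for i in range(20)]

selected = rank_features(X, y, top_k=1)
fitness = cv_accuracy(X, y, selected, k=1, folds=10)
```

In a hybrid scheme of this kind, a search procedure such as GSA would call a function like `cv_accuracy` repeatedly on candidate subsets drawn from the filtered features, so the filter stage mainly serves to shrink the search space before the more expensive wrapper evaluation.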
