Abstract

Machine learning techniques have the potential to revolutionise medical diagnosis. Single Nucleotide Polymorphisms (SNPs) are one of the most important sources of human genome variability and have been implicated in several human diseases. Various techniques have been applied to SNP data to separate affected samples from normal ones. Achieving high classification accuracy in such a high-dimensional space is crucial for successful diagnosis and treatment. In this work, we propose an accurate hybrid feature selection method for detecting the most informative SNPs and selecting an optimal SNP subset. The proposed method fuses a filter method, Conditional Mutual Information Maximization (CMIM), with a wrapper method, Support Vector Machine Recursive Feature Elimination (SVM-RFE). The performance of the proposed method was evaluated against three state-of-the-art feature selection methods, Minimum Redundancy Maximum Relevancy (mRMR), CMIM and ReliefF, using four classifiers, Support Vector Machine (SVM), Naive Bayes (NB), Linear Discriminant Analysis (LDA) and k-Nearest Neighbors (k-NN), on an Autism Spectrum Disorder (ASD) SNP dataset obtained from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) genomics data repository. The experimental results demonstrate the effectiveness of the proposed feature selection approach, which outperforms all of the compared feature selection algorithms and achieves up to 89% classification accuracy on this dataset.
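
The abstract does not say how the filter and wrapper stages are combined, so the following is only a minimal sketch of the filter stage: a greedy CMIM selection over discrete genotype codes. The function names, the 0/1/2 genotype encoding and the toy data are assumptions made for illustration and are not taken from the paper. In a hybrid pipeline of the kind described, a wrapper such as SVM-RFE would presumably then prune the subset that this stage retains.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_info(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def cond_mutual_info(x, y, z):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def cmim_select(features, labels, k):
    """Greedy CMIM. `features` is a list of columns; returns k column indices."""
    # Each candidate's score starts as its plain relevance I(X_i; Y).
    scores = [mutual_info(col, labels) for col in features]
    selected = []
    remaining = set(range(len(features)))
    while remaining and len(selected) < k:
        # Pick the candidate with the highest score (lowest index on ties).
        best = max(sorted(remaining), key=lambda i: scores[i])
        selected.append(best)
        remaining.remove(best)
        # A candidate's score is the minimum information it adds about the
        # label given any single feature that has already been selected.
        for i in remaining:
            cmi = cond_mutual_info(features[i], labels, features[best])
            scores[i] = min(scores[i], cmi)
    return selected


# Hypothetical toy data: 8 samples, genotypes coded 0/1/2, binary labels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = [
    [0, 0, 0, 1, 2, 2, 2, 1],  # SNP 0: strongly associated with the label
    [0, 0, 0, 1, 2, 2, 2, 1],  # SNP 1: exact copy of SNP 0 (redundant)
    [0, 1, 0, 1, 1, 2, 1, 2],  # SNP 2: weaker but complementary signal
    [0, 1, 0, 1, 0, 1, 0, 1],  # SNP 3: unrelated to the label
]

chosen = cmim_select(features, labels, k=2)
print(chosen)  # [0, 2]
```

In this toy case SNP 1 is as relevant as SNP 0 on its own, but once SNP 0 has been selected it adds no further information about the label, so the second pick is SNP 2. A relevance-only ranking would not make that distinction.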

