Abstract
The use of machine learning tools in biological data analysis is steadily increasing, mainly because classification and recognition systems have become effective enough to help medical experts in diagnosis. In this paper, we investigate the performance of an artificial immune system-based k-nearest neighbors algorithm, with and without cross-validation, on a class of imbalanced problems from the bioinformatics field. We use an unsupervised artificial immune system algorithm to reduce the dimension of the training data and the k-nearest neighbors algorithm for classification. The experiments show the effectiveness of the proposed scheme. Selecting the E. coli database allowed us to compare our classification accuracy with that of other methods presented in the literature. The proposed hybrid system produced considerably more accurate results than Horton and Nakai's proposal [P. Horton, K. Nakai, A probabilistic classification system for predicting the cellular localization sites of proteins, in: Proceedings of the 4th International Conference on Intelligent Systems for Molecular Biology, AAAI Press, St. Louis, 1996, pp. 109–115; P. Horton, K. Nakai, Better prediction of protein cellular localization sites with the k-nearest neighbors classifier, in: Proceedings of Intelligent Systems in Molecular Biology, Halkidiki, Greece, 1997, pp. 368–383]. Besides the improvement in accuracy, an important aspect of the proposed methodology is its complexity: because the artificial immune system reduces the data, the training complexity of the proposed system is considerably lower than that of the plain k-nearest neighbors classifier.
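The two-stage idea (reduce the training data first, then classify with k-nearest neighbors on what remains) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the artificial immune system, so a simple greedy distance-threshold filter stands in for it here, under the assumption that the reduction shrinks the number of stored training samples. The function names, the `radius` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduce_training_set(points, labels, radius):
    """Stand-in for the AIS reduction step (NOT the paper's algorithm).

    Labels are ignored when deciding what to keep: a point is retained as a
    prototype only if no already-retained prototype lies within `radius`.
    Each retained prototype simply carries the label of its source point.
    """
    kept_points, kept_labels = [], []
    for point, label in zip(points, labels):
        if all(euclidean(point, kept) > radius for kept in kept_points):
            kept_points.append(point)
            kept_labels.append(label)
    return kept_points, kept_labels


def knn_predict(train_points, train_labels, query, k=1):
    """Majority vote among the k training points nearest to `query`."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two tight clusters with class labels "A" and "B".
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
labels = ["A", "A", "A", "B", "B", "B"]

proto_points, proto_labels = reduce_training_set(points, labels, radius=0.5)
prediction = knn_predict(proto_points, proto_labels, (0.9, 0.9), k=1)
```

On this toy data the filter keeps one prototype per cluster, so the classifier compares each query against two stored points instead of six. That is the kind of saving the abstract attributes to the reduction step, although the actual reduction ratio and accuracy depend on the real algorithm and data.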