Two-Stage Feature Selection Method Created for 20 Neurons Artificial Neural Networks for Automatic Breast Cancer Detection

Jalpa J Patel, Sarman K Hadia

doi:10.48048/tis.2023.4027

Abstract

Breast cancer is a common and deadly disease among women. Early detection of breast cancer from mammogram images is a challenging task. Hence, this paper proposes a novel automatic diagnosis model for breast cancer. First, the mammogram images are preprocessed with a median filter and contrast limited adaptive histogram equalization (CLAHE). The preprocessed image is then automatically segmented using the multilevel threshold method. Subsequently, statistical, texture, shape, and geometric features are extracted from the segmented image. The resulting feature vector is long, so it is important to identify the optimum features. In this paper, the dimension of the feature vector is reduced by a 2-stage feature selection method. In the first stage, information gain (IG) with the best-first search and rank feature methods is applied to the feature vector; in the second stage, the Pearson correlation method (PCM) is applied. Artificial neural networks (ANNs) are used to increase the classification accuracy of breast cancer diagnosis. In this model, the number of neurons in the single hidden layer is chosen so as to avoid overfitting in the ANN model. Based on the optimum feature selection, the appropriate number of neurons in the hidden layer is 20, and this setting is used in the proposed IG+PCM+Boosted-ANN model. The proposed model is applied to 2 standard datasets: mini-Mammographic Image Analysis Society (mini-MIAS) and Digital Database for Screening Mammography (DDSM). The proposed model outperformed other existing models, achieving accuracies of 99 and 98.80 % on the mini-MIAS and DDSM datasets, respectively.
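
As a rough illustration of the 2-stage selection idea described above, the following Python sketch ranks features by information gain and then filters them with the Pearson correlation. It is not the authors' implementation. The equal-width binning, the values of `top_k` and `corr_threshold`, and the reading of the Pearson stage as the removal of features that are highly correlated with an already-kept feature are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def information_gain(feature, labels, bins=10):
    """Information gain of one continuous feature with respect to the labels.

    The feature is discretized with equal-width bins (an assumption of this
    sketch; the abstract does not state a discretization scheme).
    """
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])  # bin index in 0..bins-1
    conditional = 0.0
    for b in np.unique(binned):
        mask = binned == b
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional


def two_stage_select(X, y, top_k=10, corr_threshold=0.9):
    """Return the column indices kept after both stages.

    Stage 1: keep the top_k features ranked by information gain.
    Stage 2: walk the ranked features in order and drop any feature whose
    absolute Pearson correlation with an already-kept feature exceeds
    corr_threshold (assumed redundancy criterion).
    """
    gains = np.array([information_gain(X[:, j], y) for j in range(X.shape[1])])
    ranked = np.argsort(-gains)[:top_k]
    corr = np.abs(np.corrcoef(X[:, ranked], rowvar=False))
    kept = []
    for i in range(len(ranked)):
        if all(corr[i, k] <= corr_threshold for k in kept):
            kept.append(i)
    return [int(ranked[i]) for i in kept]
```

Calling `two_stage_select(X, y)` on a feature matrix `X` (one row per image, one column per extracted feature) and a label vector `y` returns the indices of the retained columns, which could then be passed to a classifier such as a single-hidden-layer network with 20 neurons.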

Full Text