Abstract

This paper presents a comparative analysis of how well various texture and geometric features support the diagnosis of breast masses on mammograms. An improved machine learning-based framework was developed for the study. The proposed system was tested on 106 full-field digital mammography images from the INbreast dataset, containing a total of 115 breast mass lesions. The features were investigated individually and in various combinations by evaluating their contributions to classification accuracy. Four state-of-the-art filter-based feature selection algorithms (Relief-F, Pearson correlation coefficient, neighborhood component analysis, and term variance) were employed to select the top 20 most discriminative features. Relief-F outperformed the other feature selection algorithms, yielding 85.2% accuracy, 82.0% sensitivity, and 88.0% specificity. Further simulations then narrowed the 20 features obtained with Relief-F down to the nine most discriminative ones. The classification performance of six state-of-the-art machine learning classifiers, namely k-nearest neighbor (k-NN), support vector machine, decision tree, Naive Bayes, random forest, and ensemble tree, was investigated; the best results (accuracy = 90.4%, sensitivity = 92.0%, specificity = 88.0%) were obtained with the k-NN classifier using k = 5 neighbors and squared inverse distance weighting. The key finding is the identification of the nine most discriminative features out of a pool of 125 texture and geometric features: FD26 (FD denotes Fourier descriptor), Euler number, solidity, mean, FD14, FD13, periodicity, skewness, and contrast. The results indicate that the selected nine features can be used for the classification of breast masses in mammograms.
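
As an illustration of the filter-based selection step, the sketch below ranks features by the absolute Pearson correlation between each feature column and the class label and keeps the top-ranked ones. This is one common way to use the coefficient as a filter and is an assumption here; the abstract does not state how the scores were computed. The function names `pearson` and `rank_features` and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def rank_features(X, y, top_k):
    """Return the indices of the top_k features ranked by |r| with the label."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scores.append((abs(pearson(column, y)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:top_k]]


# Toy data (assumption): 4 samples, 3 features, binary labels.
X = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.1],
    [3.0, 5.0, 0.4],
    [4.0, 5.0, 0.2],
]
y = [0, 0, 1, 1]
selected = rank_features(X, y, top_k=2)
```

In the study's setting, `top_k` would be 20 and `X` would hold the 125 texture and geometric features per lesion. The constant second column in the toy data scores zero and is dropped.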

Highlights

  • Breast cancer continues to be one of the deadliest diseases

  • Since the main objective of this study is to investigate the effectiveness of various texture and geometric features in classifying breast masses, segmentation techniques are not emphasized; the pixel-level ground truth annotations provided with the INbreast dataset are used to extract the exact shape of the mass lesions

  • The k-nearest neighbor (k-NN) classifier was used to evaluate classification performance with tenfold cross-validation; performance was reported as the mean accuracy, sensitivity, and specificity over ten repetitions
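
The evaluation protocol described above can be sketched in plain Python: a k-NN vote weighted by the squared inverse distance (1/d²), scored by accuracy, sensitivity, and specificity under repeated tenfold cross-validation. This is a minimal sketch under stated assumptions, not the authors' implementation: the folds are unstratified, predictions are pooled within each repetition before the metrics are averaged, and all function names and the toy data are hypothetical.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=5, eps=1e-12):
    """Predict a label by k-NN voting with weights 1 / distance**2."""
    neighbors = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = {}
    for d, label in neighbors:
        # eps avoids division by zero when a training point coincides with x.
        votes[label] = votes.get(label, 0.0) + 1.0 / (d * d + eps)
    return max(votes, key=votes.get)


def confusion_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, sensitivity, specificity


def repeated_kfold(X, y, n_folds=10, n_repeats=10, k=5, seed=0):
    """Mean (accuracy, sensitivity, specificity) over repeated k-fold CV."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    per_repeat = []
    for _ in range(n_repeats):
        rng.shuffle(indices)  # new random partition for each repetition
        y_true, y_pred = [], []
        for f in range(n_folds):
            test_idx = indices[f::n_folds]
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            train_X = [X[i] for i in train_idx]
            train_y = [y[i] for i in train_idx]
            for i in test_idx:
                y_true.append(y[i])
                y_pred.append(knn_predict(train_X, train_y, X[i], k=k))
        per_repeat.append(confusion_metrics(y_true, y_pred))
    return tuple(sum(r[m] for r in per_repeat) / n_repeats for m in range(3))


# Toy data (assumption): two well-separated 2-D clusters, 20 samples each.
X = [[0.1 * i, 0.1 * (i % 3)] for i in range(20)]
X += [[10 + 0.1 * i, 10 + 0.1 * (i % 3)] for i in range(20)]
y = [0] * 20 + [1] * 20
mean_acc, mean_sens, mean_spec = repeated_kfold(X, y)
```

The 1/d² weighting lets a single very close neighbor outvote several distant ones, which is how it differs from a plain majority vote. Here the positive class (label 1) stands in for whichever class the study treats as positive.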

Summary

Introduction

Breast cancer continues to be one of the deadliest diseases. It is caused by the invasion of abnormal cells across the usual boundaries due to uncontrolled growth and division [1]. The success and widespread adoption of mammography has drastically increased the workload of radiologists. Due to this increased workload, even expert radiologists can miss a considerable number of abnormalities or misinterpret them, which may increase the number of false-positive and false-negative reports. To resolve these issues, computer-aided diagnosis (CAD) systems are used by radiologists as secondary readers [5]. As a result, CAD systems for breast masses are attracting considerable research interest.
