This work aims to develop a methodology for classifying regions of interest (ROIs) extracted from mammograms into masses and non-masses. To this end, we conducted a comparative study of several texture-based techniques, combining them across the stages of the problem. The study combines image processing, phylogenetic trees (PT), local binary patterns (LBP), support vector machines (SVM) and the micro-genetic algorithm (GA). Texture is analysed either through the PT/LBP combination or directly on grey levels, in order to compute the taxonomic diversity (Δ) and taxonomic distinction (Δ*) indexes from sub-regions of an ROI (circular, circular crown, internal mask, external mask and the combination of internal and external masks). The GA is used to estimate the best phylogenetic weights, and its results are compared with those achieved using PT only. We also analyse the behaviour of the methodology on ROIs with and without enhancement, where enhancement consists of applying a mean filter followed by contrast-limited adaptive histogram equalisation (CLAHE). PTs and SVM were used to perform feature selection. To evaluate the methodologies under analysis, we used the following metrics: sensitivity, specificity, accuracy and area under the receiver operating characteristic (ROC) curve (AUC). Sensitivity and specificity measure how well the classifier detects positive cases (masses) and negative cases (non-masses), respectively. Accuracy measures the classification performance over both classes. The ROC curve is the graphical representation of the pairs (1 - specificity, sensitivity), and the AUC is the area under this curve, which equals 1 in an ideal test. The comparison of the possible combinations for each stage of the study revealed the following results.
In the analyses without feature selection, the best results were (1) 100% accuracy and an AUC of 0.99 for feature extraction with the combination of PT, LBP and internal masks and (2) 99.5% accuracy and an AUC of 0.99 for feature extraction with the combination of GA, LBP and internal masks. In the analyses with feature selection, the best results were (1) 100% accuracy and an AUC of 1.0 for feature extraction with the combination of GA and LBP on the union of internal and external masks and (2) 98.5% accuracy and an AUC of 0.99 for the combination of PT, LBP and the union of internal and external masks.
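The evaluation metrics described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the label convention (1 = mass, 0 = non-mass) and the toy labels and scores are assumptions made for illustration only. The area under the ROC curve is computed here through the equivalent rank formulation (the fraction of mass/non-mass pairs in which the mass receives the higher score, with ties counted as one half), which may differ from the procedure used in the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = mass, 0 = non-mass)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of masses correctly classified as masses."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-masses correctly classified as non-masses."""
    return tn / (tn + fp)


def accuracy(tp, fp, tn, fn):
    """Fraction of all ROIs classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen mass scores higher
    than a randomly chosen non-mass (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six ROIs, three masses and three non-masses.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]                 # hard class decisions
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]     # classifier decision scores

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), accuracy(tp, fp, tn, fn))
print(roc_auc(y_true, scores))
```

Sweeping a decision threshold over the scores and plotting (1 - specificity, sensitivity) at each threshold traces the ROC curve itself; a classifier that ranks every mass above every non-mass yields an area of 1.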