Abstract

We developed an ensemble classifier for the task of computer-aided diagnosis of breast microcalcification clusters,which are very challenging to characterize for radiologists and computer models alike. The purpose of this study is to help radiologists identify whether suspicious calcification clusters are benign vs. malignant, such that they may potentially recommend fewer unnecessary biopsies for actually benign lesions. The data consists of mammographic features extracted by automated image processing algorithms as well as manually interpreted by radiologists according to a standardized lexicon. We used 292 cases from a publicly available mammography database. From each cases, we extracted 22 image processing features pertaining to lesion morphology, 5 radiologist features also pertaining to morphology, and the patient age. Linear discriminant analysis (LDA) models were designed using each of the three data types. Each local model performed poorly; the best was one based upon image processing features which yielded ROC area index A<sub>Z</sub> of 0.59 &plusmn; 0.03 and partial A<sub>Z</sub> above 90% sensitivity of 0.08 &plusmn; 0.03. We then developed ensemble models using different combinations of those data types, and these models all improved performance compared to the local models. The final ensemble model was based upon 5 features selected by stepwise LDA from all 28 available features. This ensemble performed with A<sub>Z</sub> of 0.69 &plusmn; 0.03 and partial A<sub>Z</sub> of 0.21 &plusmn; 0.04, which was statistically significantly better than the model based on the image processing features alone (p&lt;0.001 and p=0.01 for full and partial A<sub>Z</sub> respectively). This demonstrated the value of the radiologist-extracted features as a source of information for this task. It also suggested there is potential for improved performance using this ensemble classifier approach to combine different sources of currently available data.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call