Lung needle biopsy image classification is a critical task in computer-aided lung cancer diagnosis. In this study, a novel method, multimodal sparse representation-based classification (mSRC), is proposed for classifying lung needle biopsy images. During data acquisition, cell nuclei are automatically segmented from images of needle biopsy specimens, and features of three modalities (shape, color, and texture) are extracted from the segmented nuclei. mSRC then proceeds through a training phase and a testing phase. In the training phase, three discriminative subdictionaries corresponding to the shape, color, and texture information are jointly learned by a genetic-algorithm-guided multimodal dictionary learning approach; the learning aims to select the most discriminative samples while encouraging large disagreement among the subdictionaries. In the testing phase, a new image is classified by a hierarchical fusion strategy that first predicts the label of each cell nucleus by fusing the three modalities and then predicts the label of the image by majority voting over its nuclei. The method is evaluated on a real image set of 4372 cell nucleus regions segmented from 271 images. These regions fall into five classes: four cancerous classes (corresponding to four types of lung cancer) and one normal class (no cancer). The results demonstrate that multimodal information is important for lung needle biopsy image classification. Moreover, compared to several state-of-the-art methods (LapRLS, MCMI-AB, mcSVM, ESRC, KSRC), the proposed mSRC achieves significant improvements (e.g., mean accuracy of 88.1%, precision of 85.2%, and recall of 92.8%), especially for distinguishing between cancerous types.
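To make the testing-phase pipeline concrete, the following is a minimal Python sketch of an SRC decision rule with residual fusion across the three modalities and majority voting over nuclei. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, orthogonal matching pursuit stands in for the paper's unspecified sparse coder, the normalized residual-sum fusion is one plausible rule, and the genetic-algorithm dictionary learning of the training phase is omitted entirely.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp


def src_residuals(D, atom_labels, x, n_nonzero=10):
    """Class-wise reconstruction residuals of one feature vector x.

    D: (d, n_atoms) dictionary whose columns are training samples;
    atom_labels: (n_atoms,) class label of each column.
    """
    atom_labels = np.asarray(atom_labels)
    # Sparse-code x over the full dictionary (OMP is an assumed stand-in
    # for the paper's sparse coder, which the abstract does not specify).
    code = orthogonal_mp(D, x, n_nonzero_coefs=n_nonzero)
    residuals = {}
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        # Reconstruct x from class-c atoms only; a small residual
        # means class c explains x well.
        residuals[c] = np.linalg.norm(x - D[:, mask] @ code[mask])
    return residuals


def classify_nucleus(subdicts, atom_labels, feats):
    """Predict one nucleus label by fusing the three modalities.

    Residuals are normalized per modality and summed -- one plausible
    fusion rule; the paper's exact hierarchical fusion may differ.
    """
    classes = np.unique(np.asarray(atom_labels))
    total = {c: 0.0 for c in classes}
    for m in ("shape", "color", "texture"):
        r = src_residuals(subdicts[m], atom_labels, feats[m])
        norm = sum(r.values()) or 1.0
        for c in classes:
            total[c] += r[c] / norm  # equal weight per modality
    return min(total, key=total.get)  # smallest fused residual wins


def classify_image(subdicts, atom_labels, nuclei_feats):
    """Image label = majority vote over its nuclei's predicted labels."""
    votes = [classify_nucleus(subdicts, atom_labels, f)
             for f in nuclei_feats]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]
```

Here `subdicts` maps each modality name to its learned subdictionary, so the two-level structure of the sketch mirrors the abstract's hierarchy: nucleus-level fusion first, then image-level voting.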