Objective. We propose a semi-supervised learning scheme built on a patch-based deep learning framework for high-precision classification of seven lung tumor growth patterns in whole slide images (WSIs) when only a small amount of labeled data is available. The scheme aims to improve generalization from limited data and to reduce the dependence on large labeled datasets that is common in medical image analysis.

Approach. To address this challenge, the study employs a semi-supervised learning approach enhanced by a dynamic confidence threshold mechanism that adjusts according to the quantity and quality of the generated pseudo-labels. Compared with a high fixed threshold, this mechanism helps avoid both class imbalance among the pseudo-labels and a shortage of pseudo-labels. The study also introduces a multi-teacher knowledge distillation technique that adaptively weights the predictions of multiple teacher models, transferring reliable knowledge while safeguarding student models from low-quality teacher predictions.
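
The two mechanisms described in the Approach can be illustrated with a minimal sketch. The abstract does not give the exact update rules, so everything here is an assumption made for illustration: the function names, the `base` and `floor` values, the linear scaling of per-class thresholds by pseudo-label count, and the entropy-based teacher weighting are all hypothetical and should not be read as the authors' method. Three classes are used instead of seven for brevity.

```python
import math


def class_thresholds(pseudo_counts, base=0.95, floor=0.6):
    """Hypothetical dynamic threshold: classes with few pseudo-labels so far
    get a lower confidence threshold, which counteracts class imbalance."""
    top = max(pseudo_counts)
    if top == 0:
        # No pseudo-labels yet: start every class at the permissive floor.
        return [floor] * len(pseudo_counts)
    return [floor + (base - floor) * count / top for count in pseudo_counts]


def select_pseudo_labels(probs, thresholds):
    """Keep (sample_index, label) pairs whose top-class probability
    reaches the threshold of the predicted class."""
    selected = []
    for i, p in enumerate(probs):
        label = max(range(len(p)), key=p.__getitem__)
        if p[label] >= thresholds[label]:
            selected.append((i, label))
    return selected


def entropy(p):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def fuse_teachers(teacher_probs):
    """Hypothetical adaptive weighting: teachers with lower-entropy
    (more confident) predictions contribute more to the soft target."""
    scores = [math.exp(-entropy(p)) for p in teacher_probs]
    total = sum(scores)
    weights = [s / total for s in scores]
    n_classes = len(teacher_probs[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, teacher_probs))
        for c in range(n_classes)
    ]
    return fused, weights


# Usage: class 2 has no pseudo-labels yet, so its threshold is the lowest.
thresholds = class_thresholds([10, 5, 0])
student_probs = [[0.97, 0.02, 0.01], [0.10, 0.80, 0.10], [0.30, 0.30, 0.40]]
selected = select_pseudo_labels(student_probs, thresholds)

# A confident teacher and a nearly uniform teacher for one patch.
fused, weights = fuse_teachers([[0.90, 0.05, 0.05], [0.34, 0.33, 0.33]])
```

In this sketch the fused distribution would serve as the soft target for the student on that patch. Confidence is only a proxy for teacher reliability; a real system might instead weight teachers by agreement with labeled data.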

Main results. The framework underwent rigorous training and evaluation using a dataset of 150 WSIs, each representing one of the seven growth patterns. The experimental results demonstrate that the framework is highly accurate in classifying lung tumor growth patterns in histopathology images. Notably, the performance of the framework is comparable to that of fully supervised models and human pathologists. In addition, the framework's evaluation metrics on a publicly available dataset are higher than those of previous studies, indicating good generalizability.

Significance. This research demonstrates that a semi-supervised learning approach can achieve results comparable to fully supervised models and expert pathologists, opening new possibilities for efficient and cost-effective medical image analysis. The implementation of dynamic confidence thresholding and multi-teacher knowledge distillation represents a significant advancement in applying deep learning to complex medical image analysis tasks. This advancement could lead to faster and more accurate diagnoses, ultimately improving patient outcomes and fostering the overall progress of healthcare technology.