Cataract surgery, one of the most widely performed operations worldwide, increasingly incorporates semantic segmentation to advance computer-assisted intervention. However, tissue appearance and illumination in cataract surgery often differ across clinical centers, exacerbating the problem of domain shift. While domain adaptation can mitigate such shifts, its reliance on data centralization raises additional privacy concerns. To overcome these challenges, we propose a Multi-view Test-time Adaptation algorithm (MUTA) for segmenting cataract surgical scenes, which leverages multi-view learning to enhance model training in the source domain and model adaptation in the target domain. In the training phase, the segmentation model is equipped with multi-view decoders to boost its robustness against variations in cataract surgery. During the inference phase, test-time adaptation is performed through multi-view knowledge distillation, enabling model updates in clinics without data centralization or the associated privacy risks. We evaluated the effectiveness of MUTA in a simulated cross-center scenario built from several cataract surgery datasets. Through comparisons and further investigation, we validate that MUTA learns a robust source model and effectively adapts it to target data during the practical inference phase. Code and datasets are available at https://github.com/liamheng/CAI-algorithms.
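The test-time adaptation idea described above, where multiple decoder "views" of the same input are pulled toward their ensemble prediction, can be illustrated with a toy sketch. Everything here is hypothetical (linear "decoders", numerical gradients, the variable names `W`, `kd_loss`); the actual MUTA implementation uses deep segmentation decoders with backpropagation, and the ensemble teacher would normally be detached from the gradient:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
C, D = 3, 4                                            # classes, feature dim
W = [rng.normal(size=(D, C)) * 0.1 for _ in range(2)]  # two "multi-view decoders"
x = rng.normal(size=(8, D))                            # unlabeled target-domain features

def views(weights):
    # Per-view class probabilities for the same target inputs.
    return [softmax(x @ Wk) for Wk in weights]

def kd_loss(weights):
    # Cross-entropy of each view against the ensemble pseudo-label
    # (the "teacher"); minimizing it distills cross-view consensus.
    ps = views(weights)
    teacher = sum(ps) / len(ps)
    return -sum((teacher * np.log(pk + 1e-8)).sum() for pk in ps) / (len(ps) * len(x))

# One test-time update step via numerical gradient descent on the
# distillation loss -- no target labels are ever used.
eps, lr = 1e-5, 0.05
loss_before = kd_loss(W)
grads = []
for k in range(len(W)):
    g = np.zeros_like(W[k])
    for i in range(D):
        for j in range(C):
            Wp = [w.copy() for w in W]
            Wp[k][i, j] += eps
            g[i, j] = (kd_loss(Wp) - loss_before) / eps
    grads.append(g)
W = [W[k] - lr * grads[k] for k in range(len(W))]
loss_after = kd_loss(W)
print(loss_before, loss_after)  # distillation loss decreases after the update
```

The point of the sketch is only the structure of the update: the views disagree on target data because of domain shift, the ensemble supplies a label-free teacher signal, and a gradient step on the distillation loss adapts the model in place, which is what allows adaptation inside a clinic without centralizing data.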