Local-to-global spatial learning for whole-slide image representation and classification.

Jiahui Yu, Cheng Zhuo, Tianyu Ma, Yu Fu, Yingke Xu, Hang Chen, Maode Lai

doi:10.1016/j.compmedimag.2023.102230

Abstract

Whole-slide images (WSIs) provide an important reference for clinical diagnosis. WSI classification with only WSI-level labels can be framed as a multi-instance learning (MIL) task. However, most existing MIL-based WSI classification methods mine correlations between instances only moderately well, because they are limited by their instance-level classification strategy. Herein, we propose a novel local-to-global spatial learning method, called Global-Local Attentional Multi-Instance Learning (GLAMIL), that mines global position and local morphological information by redefining the MIL-based WSI classification strategy and thereby learns better WSI-level representations. GLAMIL focuses on regional relationships rather than single instances. It first learns relationships between patches in the local pool to aggregate region correlations (tissue types of a WSI). These correlations are then mined further to build the WSI-level representation, in which position correlations between different regions can be modeled. Furthermore, Transformer layers are employed to model global and local spatial information rather than simply being used as feature extractors, and the corresponding structural improvements are presented. In addition, we evaluate GLAMIL on three benchmarks covering various challenging factors and achieve satisfactory results. GLAMIL outperforms state-of-the-art methods and baselines by about 1% and 10%, respectively.
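The local-to-global idea can be illustrated with a minimal sketch. It is not the GLAMIL implementation: the abstract states that Transformer layers model the local and global information, whereas this sketch swaps in plain dot-product attention pooling with fixed query vectors, and it does not model position correlation between regions. The function names (`attention_pool`, `local_to_global`), the query vectors, and the assumption that patches arrive already grouped into regions are all hypothetical and chosen only for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: List[Vector], query: Vector) -> Vector:
    """Weighted average of feature vectors.

    Weights are the softmax of each vector's dot product with `query`
    (a stand-in for learned attention, assumed here for illustration).
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def local_to_global(regions: List[List[Vector]],
                    local_query: Vector,
                    global_query: Vector) -> Vector:
    """Two-stage aggregation: patches -> region vectors -> slide vector."""
    # Local stage: summarize the patches inside each region.
    region_vectors = [attention_pool(patches, local_query) for patches in regions]
    # Global stage: summarize the region vectors into one slide-level vector.
    return attention_pool(region_vectors, global_query)


# Toy slide: two regions, each holding two 2-dimensional patch features.
regions = [
    [[1.0, 0.0], [3.0, 0.0]],
    [[0.0, 2.0], [0.0, 4.0]],
]
# Zero queries give uniform attention, so each stage reduces to a mean.
slide_vector = local_to_global(regions, [0.0, 0.0], [0.0, 0.0])
```

With zero queries both stages reduce to simple averaging, which makes the toy example easy to check by hand. In a trained model the attention weights would be learned, so that informative patches and regions contribute more to the slide-level representation.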

Full Text