Model architecture and tile size selection for convolutional neural network training for non-small cell lung cancer detection on whole slide images

Angus Lang Sun Lee, Curtis Chun Kit To, Alfred Lok Hang Lee, Joshua Jing Xi Li, Ronald Cheong Kin Chan

doi:10.1016/j.imu.2022.100850

Open Access

Abstract

Recent advances in artificial-intelligence-based computer vision systems have demonstrated impressive image and pattern recognition capabilities. A special class of neural networks known as convolutional neural networks is used in a wide variety of computer vision tasks such as image classification, object detection and autonomous driving. There is huge potential to adopt such technology in the domain of pathology. Image data in pathology are considerably larger than in typical image recognition problems, so the image is often dissected into smaller pieces, a step known as tiling. This paper aims to systematically compare common model architectures and input tile sizes and to find the optimal configuration in the context of lung cancer classification problems.

A dataset of 87 whole slide images of lung cancer specimens was collected and annotated by two pathologists. Annotated areas were grouped into four classes (tumor area, non-tumor area, necrosis area, and immune cells). Annotations were converted into labelled tiles at different tile sizes, from 296 to 10000 pixels (74–2500 μm). The problem was framed as a supervised 4-class classification problem using deep learning.

For each tile size, three models (VGG19, InceptionResNetV2 and EfficientNet-B3) were trained. Model performance was measured on a holdout dataset using standard quantitative metrics including F1-score and AUC-ROC. Our best model instance, with a tile size of 500 × 500 pixels (125 × 125 μm), achieved an F1-score of 0.9685 and an AUC-ROC score of 0.9627.

Our results showed that tile size had a significant impact on model performance. The optimal tile size was between 500 and 1000 pixels (125–250 μm) after both quantitative and qualitative assessments. VGG19 marginally outperformed the other model architectures.
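The relationship between tile size in pixels and in micrometres, and the tiling step itself, can be illustrated with a minimal Python sketch. The resolution of 0.25 μm per pixel is an assumption inferred from the paired values in the abstract (for example, 296 pixels and 74 μm); it is not stated explicitly. The function names and the non-overlapping grid layout are hypothetical and are not taken from the paper, which does not describe its tiling procedure here.

```python
# Assumption: 0.25 um per pixel, inferred from the abstract (296 px ~ 74 um).
MICRONS_PER_PIXEL = 0.25


def tile_size_in_microns(tile_px, mpp=MICRONS_PER_PIXEL):
    """Convert a tile edge length from pixels to micrometres."""
    return tile_px * mpp


def tile_origins(width, height, tile_px):
    """Return top-left (x, y) corners of non-overlapping square tiles.

    Only tiles that fit entirely inside a width x height region are kept.
    The non-overlapping grid is an illustrative choice, not the paper's method.
    """
    return [
        (x, y)
        for y in range(0, height - tile_px + 1, tile_px)
        for x in range(0, width - tile_px + 1, tile_px)
    ]


if __name__ == "__main__":
    for size in (296, 500, 1000, 10000):
        print(size, "px =", tile_size_in_microns(size), "um")
    print(len(tile_origins(2000, 1000, 500)), "tiles of 500 px in a 2000 x 1000 region")
```

Under this assumed resolution, the reported optimal range of 500 to 1000 pixels corresponds to 125 to 250 μm, matching the figures in the abstract.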
