Abstract

MotivationWhile deep-learning algorithms have demonstrated outstanding performance in semantic image segmentation tasks, large annotation datasets are needed to create accurate models. Annotation of histology images is challenging due to the effort and experience required to carefully delineate tissue structures, and difficulties related to sharing and markup of whole-slide images.ResultsWe recruited 25 participants, ranging in experience from senior pathologists to medical students, to delineate tissue regions in 151 breast cancer slides using the Digital Slide Archive. Inter-participant discordance was systematically evaluated, revealing low discordance for tumor and stroma, and higher discordance for more subjectively defined or rare tissue classes. Feedback provided by senior participants enabled the generation and curation of 20 000+ annotated tissue regions. Fully convolutional networks trained using these annotations were highly accurate (mean AUC=0.945), and the scale of annotation data provided notable improvements in image classification accuracy.Availability and ImplementationDataset is freely available at: https://goo.gl/cNM4EL.Supplementary information Supplementary data are available at Bioinformatics online.

Highlights

  • Accurate segmentation of tissue regions in histology images is a challenging problem with important applications in computational pathology

  • The dataset used in this study consists of 151 hematoxylin and eosin stained whole-slide images (WSIs) corresponding to 151 histologicallyconfirmed breast cancer cases

  • This data can be visualized in a public instance of the Digital Slide Archive (DSA) at https://goo. gl/cNM4EL

Read more

Summary

Introduction

Accurate segmentation of tissue regions in histology images is a challenging problem with important applications in computational pathology. While deep-learning methods have demonstrated promising results in general semantic image segmentation problems, these encoder-decoder convolutional architectures require large training datasets to generalize well. Generating annotated histology datasets with adequate scale presents significant challenges, especially when careful delineation of regions or structures is required, and the lack of annotated histology remains a significant barrier in the growth of computational pathology. A pathologist with significant clinical demands often cannot produce enough annotations on their own to adequately train deep-learning models for challenging applications like semantic segmentation. Interfaces for viewing and annotating whole-slide histology images, collaborative review and data management are a critical element in engaging pathologists to scale the production of accurate ground truth

Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.