Abstract

Numerous cancer histopathology specimens have been collected and digitized over the past few decades. A comprehensive evaluation of the distribution of various cells in tumor tissue sections can provide valuable information for understanding cancer. Deep learning is suitable for achieving these goals; however, the collection of extensive, unbiased training data is hindered, thus limiting the production of accurate segmentation models. This study presents SegPath-the largest annotation dataset (>10 times larger than publicly available annotations)-for the segmentation of hematoxylin and eosin (H&E)-stained sections for eight major cell types in cancer tissue. The SegPath generating pipeline used H&E-stained sections that were destained and subsequently immunofluorescence-stained with carefully selected antibodies. We found that SegPath is comparable with, or outperforms, pathologist annotations. Moreover, annotations by pathologists are biased toward typical morphologies. However, the model trained on SegPath can overcome this limitation. Our results provide foundational datasets for machine-learning research in histopathology.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call