Abstract

Canine mammary carcinoma (CMC) has been used as a model to investigate the pathogenesis of human breast cancer, and the same grading scheme is commonly used to assess tumor malignancy in both. One key component of this grading scheme is the density of mitotic figures (MF). Current publicly available datasets on human breast cancer only provide annotations for small subsets of whole slide images (WSIs). We present a novel dataset of 21 WSIs of CMC completely annotated for MF. For this, a pathologist screened all WSIs for potential MF and structures with a similar appearance. A second expert blindly assigned labels, and for non-matching labels, a third expert assigned the final labels. Additionally, we used machine learning to identify previously undetected MF. Finally, we performed representation learning and two-dimensional projection to further increase the consistency of the annotations. Our dataset consists of 13,907 MF and 36,379 hard negatives. We achieved a mean F1-score of 0.791 on the test set and up to 0.696 on a human breast cancer dataset.
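The three-expert labeling procedure described above can be illustrated with a minimal Python sketch. It is not the authors' tooling: the label strings and the function name `consensus_label` are hypothetical, and the sketch assumes each candidate cell carries one label per expert, with the third expert consulted only when the first two disagree.

```python
from typing import Optional

# Hypothetical label names, chosen for illustration only.
MITOTIC_FIGURE = "mitotic_figure"
HARD_NEGATIVE = "hard_negative"


def consensus_label(first: str, second: str, third: Optional[str] = None) -> str:
    """Return the final label for one candidate cell.

    If the first two experts agree, their shared label is final.
    Otherwise the third expert's label decides.
    """
    if first == second:
        return first
    if third is None:
        # Disagreement cannot be resolved without a third opinion.
        raise ValueError("experts disagree; a third opinion is required")
    return third


# Agreement: the third expert is not needed.
agreed = consensus_label(MITOTIC_FIGURE, MITOTIC_FIGURE)

# Disagreement: the third expert assigns the final label.
resolved = consensus_label(MITOTIC_FIGURE, HARD_NEGATIVE, HARD_NEGATIVE)
```

Raising an error when the third label is missing, rather than silently picking one expert, keeps unresolved disagreements visible in this sketch.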

Highlights

  • Histologic assessment of tissue is the gold standard in tumor diagnosis and prognostication and is a key component in the selection of the best suited therapy

  • We present a novel, large-scale dataset of canine mammary carcinoma, providing annotations for 21 complete whole slide images of hematoxylin and eosin (H&E)-stained tissue

  • The number of mitotic figures was increased by 6.06% compared to the manually expert labeled (MEL) dataset, which is in line with previous findings[12]

Summary

Background & Summary

Histologic assessment of tissue is the gold standard in tumor diagnosis and prognostication and is a key component in the selection of the best suited therapy. The location within the tumor is specified less precisely, but many grading schemes suggest an area at the periphery of the tumor, where the tumor cells are assumed to have greater capacity for proliferation (invasion front). To the best of the authors' knowledge, this underlying assumption has not yet been shown to be generally true in mammary carcinoma. [...] available for the complete whole slide image (WSI) - or, better: for several WSIs, ideally representing the complete tumor. As neither of these processes can be performed manually by a pathologist within the scope of clinical practice, algorithmic support for pathologists by means of a decision support system would be beneficial. We present a novel, large-scale dataset of canine mammary carcinoma, providing annotations for 21 complete whole slide images of H&E-stained tissue. We tested the pipelines trained with our dataset on the largest available human mammary carcinoma dataset (TUPAC16[10]).
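Detection performance on both the canine test set and the human dataset is reported in the abstract as an F1-score. As a reminder of what that metric measures, the following sketch computes it from detection counts. The function name and the example counts are hypothetical and are not taken from the paper; how detections are matched to annotations is also not specified here.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Compute F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No detections and no annotations: define the score as 0.0 by convention.
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical counts for illustration only (not results from the dataset).
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

With these illustrative counts, precision and recall are both 8/10, so the F1-score is also 0.8.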
