Abstract
PURPOSE: Machine Learning Package for Cancer Diagnosis (MLCD) is the result of a National Institutes of Health/National Cancer Institute (NIH/NCI)-sponsored project for developing a unified software package from state-of-the-art breast cancer biopsy diagnosis and machine learning algorithms that can improve the quality of both clinical practice and ongoing research.
METHODS: Whole-slide images of 240 well-characterized breast biopsy cases, initially assembled under R01 CA140560, were used for developing the algorithms and training the machine learning models. This software package is based on the methodology developed and published under our recent NIH/NCI-sponsored research grant (R01 CA172343) for finding regions of interest (ROIs) in whole-slide breast biopsy images, for segmenting ROIs into histopathologic tissue types, and for using this segmentation in classifiers that can suggest final diagnoses.
RESULTS: The package provides an ROI detector for whole-slide images and modules for semantic segmentation of the ROIs into tissue classes and diagnostic classification of the ROIs into 4 classes (benign, atypia, ductal carcinoma in situ, invasive cancer). It is available through the GitHub repository under the Massachusetts Institute of Technology license and will later be distributed with the Pathology Image Informatics Platform system. A Web page provides instructions for use.
CONCLUSION: Our tools have the potential to provide help to other cancer researchers and, ultimately, to practicing physicians and will motivate future research in this field. This article describes the methodology behind the software development and gives sample outputs to guide those interested in using this package.
Highlights
The long-term goal of the National Institutes of Health/National Cancer Institute (NIH/NCI)-sponsored project, “A Unified Machine Learning Package for Cancer Diagnosis” (U01CA231782), which is part of the Information Technology for Cancer Research (ITCR) program, was the development of a unified software package for the diagnosis of cancer from whole-slide biopsy images.
Whole-slide images of 240 well-characterized breast biopsy cases, initially assembled under R01 CA140560, were used for developing the algorithms and training the machine learning models. This software package is based on the methodology developed and published under our recent National Institutes of Health/National Cancer Institute (NIH/NCI)-sponsored research grant (R01 CA172343) for finding regions of interest (ROIs) in whole-slide breast biopsy images, for segmenting ROIs into histopathologic tissue types, and for using this segmentation in classifiers that can suggest final diagnoses.
NIH/NCI-sponsored research grants, including our own (R01 CA140560; R01 CA172343), have produced uniquely well-characterized biopsy images and methodology for finding regions of interest (ROIs),[2] segmenting them into histopathologic tissue types,[3] and using this segmentation as input to classifiers that suggest diagnoses.[4] These methods are being converted into a unified Python software package, Machine Learning Package for Cancer Diagnosis (MLCD), with the corresponding modules available through the GitHub repository under the Massachusetts Institute of Technology license, to be distributed later with the Pathology Image Informatics Platform (PIIP) system developed by Martel et al.[5]
Summary
Our CNN model was trained on 384 × 384 patches from 58 ROIs fully annotated by an experienced pathologist. Applying this trained CNN classifier to any given ROI yields a labeled image in which the tissue types are represented by different integer labels and can be visualized in different colors: 0, Background (white); 2, Benign Epithelium (magenta); 3, Malignant Epithelium (blue); 4, Normal Stroma (pink); 5, Desmoplastic Stroma (violet); 6, Secretion (green); 7, Blood (yellow); 8, Necrosis (red).
[Figure legends: tissue-type segmentation (benign epithelium, malignant epithelium, normal stroma, desmoplastic stroma, secretion, necrosis, blood); ROI detection (correct, incorrect)]
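To make the label-to-color convention concrete, the following is a minimal Python sketch of how a segmentation label image could be rendered in color. It is not taken from the MLCD package: the function name `colorize_labels`, the `TISSUE_CLASSES` table, and the exact RGB triples are assumptions chosen for illustration. Only the integer labels, tissue names, and color names come from the description above.

```python
import numpy as np

# Integer label -> (tissue name, RGB color).
# Label ids, tissue names, and color names follow the text above;
# the exact RGB triples are assumed for illustration only.
TISSUE_CLASSES = {
    0: ("Background", (255, 255, 255)),          # white
    2: ("Benign Epithelium", (255, 0, 255)),     # magenta
    3: ("Malignant Epithelium", (0, 0, 255)),    # blue
    4: ("Normal Stroma", (255, 192, 203)),       # pink
    5: ("Desmoplastic Stroma", (238, 130, 238)), # violet
    6: ("Secretion", (0, 128, 0)),               # green
    7: ("Blood", (255, 255, 0)),                 # yellow
    8: ("Necrosis", (255, 0, 0)),                # red
}


def colorize_labels(label_map):
    """Convert an (H, W) array of integer tissue labels to an (H, W, 3) uint8 RGB image."""
    labels = np.asarray(label_map)
    # Reject any label that is not listed in the table (for example, 1).
    unknown = set(np.unique(labels).tolist()) - set(TISSUE_CLASSES)
    if unknown:
        raise ValueError(f"unknown tissue labels: {sorted(unknown)}")
    # Build a lookup table indexed by label id, then map every pixel at once.
    lut = np.zeros((max(TISSUE_CLASSES) + 1, 3), dtype=np.uint8)
    for label, (_name, rgb) in TISSUE_CLASSES.items():
        lut[label] = rgb
    return lut[labels]
```

A lookup table keeps the mapping vectorized, so the same call works for a small patch or a full ROI. The resulting array can be passed to any image library that accepts 8-bit RGB data.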