Abstract

The prediction of microsatellite instability (MSI) using deep learning (DL) techniques could have significant benefits, including lower cost and wider MSI testing of colorectal cancer (CRC) patients. Nonetheless, batch effects or systematic biases are not well characterized in digital histology models and lead to overoptimistic estimates of model performance. Methods are needed that not only palliate biases but directly abrogate them. We present a multiple-bias-rejecting DL system based on adversarial networks for the prediction of MSI in CRC from tissue microarrays (TMAs), trained and validated on 1788 patients from EPICOLON and HGUA. The system consists of an end-to-end image preprocessing module that tiles samples at multiple magnifications and a tissue classification module linked to the bias-rejecting MSI predictor. We detected three biases associated with the learned representations of a baseline model: the project of origin of the samples, the patient’s spot, and the TMA glass where each spot was placed. The system was trained to directly avoid learning the batch effects of those variables. The learned features from the bias-ablated model achieved maximum discriminative power with respect to the task and minimal statistical mean dependence with the biases. We analyze the impact of different magnifications and tissue types, and the model performance at tile vs patient level. With all three selected tissues (tumor epithelium, mucin and lymphocytic regions) and 4 magnifications, the AUC was 0.87 ± 0.03 at tile level and increased to 0.9 ± 0.03 at patient level. To the best of our knowledge, this is the first work that incorporates a multiple bias ablation technique in the DL architecture in digital pathology, and the first to use TMAs for the MSI prediction task.
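
The abstract reports that the bias-ablated features show minimal statistical mean dependence with the biases. As a rough illustration only, one common proxy for such dependence is the squared Pearson correlation between a score derived from the features and a numerically encoded bias variable; this choice of measure is an assumption made here and is not stated in this section as the system's actual loss. The function name `squared_pearson` and the toy inputs are hypothetical.

```python
def squared_pearson(xs, ys):
    """Squared Pearson correlation between two equal-length numeric sequences.

    Assumption: used here only as an illustrative proxy for linear (mean)
    dependence between a feature-derived score and an encoded bias variable.
    Values near 0 suggest little linear dependence; values near 1 suggest strong.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant sequence carries no linear information about the other.
        return 0.0
    return (cov * cov) / (var_x * var_y)
```

In an adversarial setup of the kind the abstract describes, a quantity like this would be driven toward zero for each bias variable while the MSI task loss is minimized; the details of how the system combines them are not given in this section.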

Highlights

  • The tissue classifier module was used in inference for the pre-selection of the regions of interest restricted to tumor epithelium, lymphocytic infiltrates and mucin totaling

  • For each of the 5 folds, tile predictions of the bias-controlled model were aggregated by majority voting to decide the microsatellite instability (MSI) status at the patient level. The 5 validation folds consisted of approximately 300 patients each and the mean prevalence of microsatellite instability-high (MSI-H) was
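
The aggregation of tile predictions into a patient-level MSI status by majority voting can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `patient_level_status`, the `(patient_id, label)` input layout, the `"MSS"` label for non-MSI-H tiles, and the tie-breaking behavior are all assumptions.

```python
from collections import Counter, defaultdict


def patient_level_status(tile_predictions):
    """Aggregate per-tile labels into one label per patient by majority vote.

    tile_predictions: iterable of (patient_id, label) pairs, one per tile
    (hypothetical layout). Ties are broken in favor of the label seen first
    for that patient, which is an assumption of this sketch.
    """
    votes = defaultdict(Counter)
    for patient_id, label in tile_predictions:
        votes[patient_id][label] += 1
    # most_common(1) returns the label with the highest tile count.
    return {pid: counts.most_common(1)[0][0] for pid, counts in votes.items()}


# Toy example with made-up tiles for two patients.
tiles = [
    ("p1", "MSI-H"), ("p1", "MSS"), ("p1", "MSI-H"),
    ("p2", "MSS"), ("p2", "MSS"),
]
result = patient_level_status(tiles)
```

Because each patient contributes many tiles, a vote of this kind can smooth out individual tile errors, which is consistent with the higher AUC reported at patient level than at tile level.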


Summary

Introduction

3% of colorectal cancers (CRC) arise in the context of Lynch syndrome (LS), where the patient has a germline mutation in a DNA mismatch repair (MMR) gene [1]. Historically, CRC patients were tested for LS if they were at high risk according to clinical criteria, e.g., aged under 50 years or with a strong family history.

Biomolecules 2021, 11, 1786

individuals at risk for Lynch syndrome or eligible for tumor-based MSI testing [2]. Universal tumor-based genetic screening for Lynch syndrome, with MSI screening by PCR or defective MMR (dMMR) detection by IHC testing of all CRCs regardless of age, has greater sensitivity for identifying Lynch syndrome than other strategies [1]. Patients with tumors showing MSH2, MSH6 or isolated PMS2 loss, or MLH1 loss/MSI with no evidence of BRAF mutation/MLH1 promoter hypermethylation, are referred for germline testing if clinically appropriate.

