Abstract

Background and Aims
Machine learning (ML) holds great promise for improving diagnostics, prognostication and theranostics in nephropathology. So far, applications have not gone much further than segmentation of tissue compartments on whole slide images (WSIs) of paraffin sections. As a proof-of-concept study, we describe the development of a diagnostic classifier for glomerulonephritis (GN) that uses only expert-annotated or automatically segmented glomerular transections from periodic acid-Schiff (PAS) paraffin sections.

Method
A total of n = 350 biopsies from 5 institutions, covering 12 classes of glomerulonephritis, were included in the study with their respective PAS sections: IgA nephropathy (IgAN), membranous nephropathy (Membranous), anti-glomerular basement membrane antibody GN (ABMGN), infection-associated GN (IAGN), ANCA-associated GN (ANCA-GN), idiopathic membranoproliferative GN (MPGN), SLE GN class IV (SLE-GN-IV), cryoglobulinemic GN (CryoGN), C3 GN (C3-GN), dense deposit disease (DDD), fibrillary GN (FibrillaryGN) and proliferative GN with monoclonal immunoglobulin deposits (PGNMID). Glomerular transections were expert-annotated by a nephropathologist and, separately, automatically segmented with our own transformer-based segmentation model, which had been trained on 100 biopsies with thrombotic microangiopathies and a range of vascular, vasculitic and glomerular diseases closely resembling or mimicking thrombotic microangiopathies. For classification, we divided the cohort into 5 folds for internal cross-validation and performed data augmentation with various methods (including shifts in resolution/scale, AutoAugment and others). We trained our proprietary self-attention-based MILx architecture on an EfficientNet backbone in a semi-supervised fashion, using the diagnostic class label of each biopsy and selecting batches of glomerular crops by soft Markov chain Monte Carlo sampling. We compared the performance of our proprietary architecture, on both expert-annotated and automatically segmented glomerular crops, with a recently published benchmark architecture (CLAM) for multiple-instance learning in histopathology.

Results
Automatic glomerular segmentation performance was excellent, with mean AUC and sensitivity (mean average recall) over all classes of 0.904 and near-perfect mean average specificity (0.994); as expected, it was best for Membranous and worst for ABMGN. Classification with MILx on expert-annotated glomerular crops reached a mean balanced accuracy of 0.84, with AUC values, in descending order, of 0.97 for Membranous, 0.89 for ABMGN, 0.88 for IgAN, 0.86 for FibrillaryGN, 0.83 for MPGN, 0.80 for ANCA-GN, 0.79 for DDD, 0.78 for PGNMID, 0.75 for IAGN, 0.73 for SLE-GN-IV and CryoGN, and 0.67 for C3-GN. MILx performance was similar with automatically segmented glomerular crops as input. On this dataset, MILx outperformed CLAM by a substantial margin, whether CLAM received entire WSIs or expert-annotated glomerular crops as input (mean balanced accuracy of 0.72).

Conclusion
This proof-of-concept study indicates that nephropathology-specific architectures such as our MILx can be trained for complex tasks on relatively small biopsy cohorts. We should be able to deliver an end-to-end pipeline for this diagnostic task and others, based on training sets with case labels provided by trusted institutions and requiring only minimal expert labeling or annotation.

PAC and HQ contributed equally to this work.
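The mean balanced accuracy quoted in the Results is commonly defined as the unweighted mean of per-class recall. The abstract does not spell out its formula, so the following is a minimal sketch assuming that standard definition. The function name `balanced_accuracy` and the example labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true.

    Assumption: this is the standard definition; the abstract does not
    state which formula the authors used.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no samples given")

    total = defaultdict(int)    # number of cases per true class
    correct = defaultdict(int)  # correctly predicted cases per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical biopsy-level labels for two of the twelve classes.
y_true = ["IgAN", "IgAN", "IgAN", "IgAN", "DDD", "DDD"]
y_pred = ["IgAN", "IgAN", "IgAN", "DDD", "DDD", "IgAN"]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy: {score:.3f}, plain accuracy: {plain:.3f}")
```

Because every class contributes equally regardless of its case count, this metric is less flattered by good performance on common classes than plain accuracy, which matters in a cohort that mixes frequent and rare diagnoses.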