Expert Labels Research Articles

Abstract

Background and Aims: Machine learning (ML) holds great promise for improving diagnostics, prognostication and theranostics in nephropathology. So far, applications have not gone much further than segmentation of tissue compartments on whole slide images (WSIs) of paraffin sections. As a proof-of-concept study, we describe the development of a diagnostic classifier for glomerulonephritis based on expert-annotated or automatically segmented glomerular transections from periodic acid-Schiff (PAS) paraffin sections only.

Method: A total of n = 350 biopsies from 5 institutions, covering 12 classes of glomerulonephritis (GN), were included in the study with their respective PAS sections. The classes were IgA nephropathy (IgAN), membranous nephropathy (Membranous), anti-glomerular basement membrane antibody GN (ABMGN), infection-associated GN (IAGN), ANCA-associated GN (ANCA-GN), idiopathic membranoproliferative GN (MPGN), SLE GN class IV (SLE-GN-IV), cryoglobulinemic GN (CryoGN), C3 GN (C3-GN), dense deposit disease (DDD), fibrillary GN (FibrillaryGN) and proliferative GN with monoclonal immunoglobulin deposits (PGNMID). Glomerular transections were expert-annotated by a nephropathologist and automatically segmented with our own transformer-based segmentation model, which was trained on 100 biopsies with thrombotic microangiopathies and a range of vascular, vasculitic and glomerular diseases closely resembling or mimicking thrombotic microangiopathies. For classification, we divided the cohort into 5 folds for internal cross-validation and performed sample size augmentation with various methods (including shifts in resolution/scale, AutoAugment and others). We then trained our proprietary self-attention-based MILx architecture on an EfficientNet backbone in a semi-supervised fashion, selecting glomerular crop batches by soft Markov chain Monte Carlo sampling and using the diagnostic class label of each biopsy. We compared the performance of our proprietary architecture on both expert-annotated and automatically segmented glomerular crops with a recently published benchmark architecture (CLAM) for multiple-instance learning in histopathology.

Results: Automatic glomerular segmentation performance was excellent, with mean AUC and sensitivity (mean average recall) over all classes at 0.904 and near-perfect mean average specificity (0.994); as expected, it was best for Membranous and worst for ABMGN. Classification with MILx on expert-annotated glomerular crops reached a mean balanced accuracy of 0.84, with AUC values, in descending order, of 0.97 for Membranous, 0.89 for ABMGN, 0.88 for IgAN, 0.86 for FibrillaryGN, 0.83 for MPGN, 0.80 for ANCA-GN, 0.79 for DDD, 0.78 for PGNMID, 0.75 for IAGN, 0.73 for SLE-GN-IV and CryoGN, and 0.67 for C3-GN. Performance with MILx was similar for automatically segmented glomerular crops as input. On this dataset, MILx outperformed CLAM, with either entire WSIs or expert-annotated glomerular crops as CLAM inputs (mean balanced accuracy of 0.72), by a significant margin.

Conclusion: This proof-of-concept study indicates that nephropathology-specific architectures like our MILx can be trained for complex tasks on relatively small biopsy cohorts. We should be able to deliver an end-to-end pipeline for this diagnostic task and others, based on training sets with case labels provided by trusted institutions and with only minimal expert labeling or annotation required.

PAC and HQ contributed equally to this work.
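The balanced accuracy reported above is commonly defined as the mean of the per-class recalls, so that rare diagnostic classes weigh as much as common ones. The sketch below illustrates that standard definition only; the function name and the toy labels are invented for illustration, and the abstract does not state how the authors computed the metric.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = defaultdict(int)    # number of true cases per class
    correct = defaultdict(int)  # number of correctly predicted cases per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy biopsy-level labels, invented for illustration (not study data).
y_true = ["IgAN", "IgAN", "IgAN", "IgAN", "Membranous", "Membranous"]
y_pred = ["IgAN", "IgAN", "IgAN", "Membranous", "Membranous", "Membranous"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

In this toy case the per-class recalls are 3/4 for IgAN and 2/2 for Membranous, so the balanced accuracy is their mean rather than the fraction of all cases predicted correctly.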

Dermoscopy is commonly used for the evaluation of pigmented lesions, but agreement between experts for identification of dermoscopic structures is known to be relatively poor. Expert labeling of medical data is a bottleneck in the development of machine learning (ML) tools, and crowdsourcing has been demonstrated as a cost- and time-efficient method for the annotation of medical images. The aim of this study is to demonstrate that crowdsourcing can be used to label basic dermoscopic structures from images of pigmented lesions with similar reliability to a group of experts. First, we obtained labels of 248 images of melanocytic lesions with 31 dermoscopic "subfeatures" labeled by 20 dermoscopy experts. These were then collapsed into 6 dermoscopic "superfeatures" based on structural similarity, due to low interrater reliability (IRR): dots, globules, lines, network structures, regression structures, and vessels. These images were then used as the gold standard for the crowd study. The commercial platform DiagnosUs was used to obtain annotations from a nonexpert crowd for the presence or absence of the 6 superfeatures in each of the 248 images. We replicated this methodology with a group of 7 dermatologists to allow direct comparison with the nonexpert crowd. The Cohen κ value was used to measure agreement across raters. In total, we obtained 139,731 ratings of the 6 dermoscopic superfeatures from the crowd. There was relatively lower agreement for the identification of dots and globules (the median κ values were 0.526 and 0.395, respectively), whereas network structures and vessels showed the highest agreement (the median κ values were 0.581 and 0.798, respectively). This pattern was also seen among the expert raters, who had median κ values of 0.483 and 0.517 for dots and globules, respectively, and 0.758 and 0.790 for network structures and vessels. 
The median κ values between nonexperts and thresholded average-expert readers were 0.709 for dots, 0.719 for globules, 0.714 for lines, 0.838 for network structures, 0.818 for regression structures, and 0.728 for vessels. This study confirmed that IRR for different dermoscopic features varied among a group of experts; a similar pattern was observed in a nonexpert crowd. There was good or excellent agreement for each of the 6 superfeatures between the crowd and the experts, highlighting the similar reliability of the crowd for labeling dermoscopic images. This confirms the feasibility and dependability of using crowdsourcing as a scalable solution to annotate large sets of dermoscopic images, with several potential clinical and educational applications, including the development of novel, explainable ML tools.
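The Cohen κ statistic used above corrects the observed agreement between two raters for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below is a minimal illustration of that formula for one pair of raters; the function name and the toy ratings are invented, and the abstract does not describe how pairwise values were aggregated into the reported medians.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty item set")
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Toy presence/absence ratings of one feature on 10 images (not study data).
rater_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
rater_b = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]

print(f"kappa = {cohen_kappa(rater_a, rater_b):.3f}")
```

Here the raters agree on 8 of 10 images (p_o = 0.8) while chance agreement is p_e = 0.5, which gives κ = 0.6.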

Articles published on Expert Labels

Editors’ Choice—AutoEIS: Automated Bayesian Model Selection and Analysis for Electrochemical Impedance Spectroscopy

Automated sleep classification with chronic neural implants in freely behaving canines

Robust detection of marine life with label-free image feature learning and probability calibration

Accelerating voxelwise annotation of cross-sectional imaging through AI collaborative labeling with quality assurance and bias mitigation.

The Climate Change Crisis: A Review of Its Causes and Possible Responses

Learning fast and fine-grained detection of amyloid neuropathologies from coarse-grained expert labels

Rapidly adaptable automated interpretation of point-of-care COVID-19 diagnostics

#6541 GLOMERULONEPHRITIS DIAGNOSIS BY MACHINE LEARNING ON PERIODIC ACID-SCHIFF (PAS) WHOLE SLIDE IMAGES

Multiple instance learning based classification of diabetic retinopathy in weakly-labeled widefield OCTA en face images

The influence of regional tourism economy development on carbon neutrality for environmental protection using improved recurrent neural network

Fusing information entropy and similarity: A novel active learning strategy for chemical process fault classifications

ENRICHing medical imaging training sets enables more efficient machine learning.

Developing a machine learning model to detect diagnostic uncertainty in clinical documentation.

Monitoring of Pigmented Skin Lesions Using 3D Whole Body Imaging

Finding the semantic similarity in single-particle diffraction images using self-supervised contrastive projection learning

Concept evolution detection based on noise reduction soft boundary

Exploiting Superpixel-Based Contextual Information on Active Learning for High Spatial Resolution Remote Sensing Image Classification

Contrastive learning for unsupervised medical image clustering and reconstruction

Label fusion and training methods for reliable representation of inter-rater uncertainty

Agreement Between Experts and an Untrained Crowd for Identifying Dermoscopic Features Using a Gamified App: Reader Feasibility Study.
