Minimal Annotation Research Articles

A known issue that hinders the development of deep learning models is the need for accurate annotation of a large quantity of samples – a time-consuming, labor-intensive, and error-prone task. This limitation is particularly critical in areas where data annotation requires expert knowledge. Semi-supervised learning methods, such as pseudo-labeling, can alleviate the problem by capitalizing on both limited labeled and plentiful unlabeled data; nonetheless, state-of-the-art methods often require pre-trained encoders and validation sets to deliver effective solutions. Herein, we introduce a teacher-student-based iterative meta-pseudo-labeling approach, named consensus Deep Feature Annotation (cons-DeepFA), that enables the training of custom Convolutional Neural Networks (CNNs) from small quantities of labeled samples without reliance on pre-trained encoders and validation sets. cons-DeepFA explores Feature Learning from Image Markers (FLIM) to initialize the filters of a target CNN (student) from minimal data annotation – i.e., user-drawn markers on discriminative regions of a few selected images per class. During each of a few iterations, the latent space of the student's last dense layer is non-linearly projected onto a two-dimensional space for downstream label propagation via an optimum-connectivity-based approach (teacher); afterward, the student is re-trained using pseudo-labeled samples selected by the proposed consensus mechanism, which jointly improves the latent space, its projection, and the student's generalization ability as iterations progress. This strategy was recently introduced with pre-trained encoders by selecting the most confident pseudo-labeled samples to re-train the student. While building on previous methods, cons-DeepFA presents two key contributions. 
It (i) incorporates FLIM to enable training a custom CNN from scratch with faster convergence, improving its generalization ability, and (ii) introduces a consensus-based procedure over multiple iterations that selects more accurately pseudo-labeled samples for re-training the CNN. Lastly, cons-DeepFA is evaluated on five challenging biological image datasets, demonstrating its effectiveness and competitiveness when compared to seven state-of-the-art methods from four semi-supervised learning paradigms.
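The consensus step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that "consensus" means a sample keeps its pseudo-label only if the teacher assigned the same label in a sufficient fraction of iterations; the abstract does not specify the exact rule, and the function name `consensus_select` and the `min_agreement` parameter are hypothetical.

```python
from collections import Counter


def consensus_select(label_history, min_agreement=1.0):
    """Pick pseudo-labeled samples whose label is stable across iterations.

    label_history: list of per-iteration label lists, all of equal length.
    min_agreement: fraction of iterations that must agree (assumed rule,
                   not taken from the cons-DeepFA paper).
    Returns a dict mapping sample index -> agreed pseudo-label.
    """
    if not label_history:
        return {}
    n_iters = len(label_history)
    n_samples = len(label_history[0])
    selected = {}
    for i in range(n_samples):
        # Count how often each label was assigned to sample i.
        votes = Counter(labels[i] for labels in label_history)
        label, count = votes.most_common(1)[0]
        if count / n_iters >= min_agreement:
            selected[i] = label
    return selected


# Toy history: 3 iterations, 4 unlabeled samples (illustrative data only).
history = [
    ["a", "b", "a", "b"],
    ["a", "b", "b", "b"],
    ["a", "a", "b", "b"],
]
unanimous = consensus_select(history)                     # strict agreement
majority = consensus_select(history, min_agreement=2 / 3)  # relaxed agreement
```

With strict agreement only samples 0 and 3 would be passed to the student for re-training; relaxing the threshold admits more samples at the cost of noisier pseudo-labels.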


Background and Aims: Machine learning (ML) holds great promise for improving diagnostics, prognostication and theranostics in nephropathology. So far, applications have not gone much further than segmentation of tissue compartments on whole slide images (WSIs) of paraffin sections. As a proof-of-concept study, we describe the development of a diagnostic classifier for glomerulonephritis based on expert-annotated or automatically segmented glomerular transections from periodic acid-Schiff (PAS) paraffin sections only.

Method: A total of n = 350 biopsies from 5 institutions, covering 12 classes of glomerulonephritis, were included in the study with their respective PAS sections. The classes were IgA nephropathy (IgAN), membranous nephropathy (Membranous), anti-glomerular basement membrane antibody GN (ABMGN), infection-associated GN (IAGN), ANCA-associated GN (ANCA-GN), idiopathic membranoproliferative GN (MPGN), SLE GN class IV (SLE-GN-IV), cryoglobulinemic GN (CryoGN), C3 GN (C3-GN), dense deposit disease (DDD), fibrillary GN (FibrillaryGN) and proliferative GN with monoclonal immunoglobulin deposits (PGNMID). Glomerular transections were expert-annotated by a nephropathologist and automatically segmented with our own transformer-based segmentation model, trained on 100 biopsies with thrombotic microangiopathies and a range of vascular, vasculitic and glomerular diseases closely resembling or mimicking thrombotic microangiopathies. For classification, we divided the cohort into 5 folds for internal cross-validation, performed sample size augmentation with various methods (including shifts in resolution/scale, AutoAugment and others) and trained our proprietary self-attention-based MILx architecture on an EfficientNet backbone. Glomerular crop batches were selected by soft Markov chain Monte Carlo sampling in a semi-supervised fashion, with diagnostic class labels for each biopsy. We compared the performance of our proprietary architecture on both expert-annotated and automatically segmented glomerular crops with a recently published benchmark architecture (CLAM) for multiple-instance learning in histopathology.

Results: Automatic glomerular segmentation performance was excellent, with mean AUC and sensitivity (mean average recall) over all classes at 0.904 and near-perfect mean average specificity (0.994); as expected, it was best for Membranous and worst for ABMGN. Classification with MILx using expert-annotated glomerular crops as inputs reached a mean balanced accuracy of 0.84, with AUC values in descending order of 0.97 for Membranous, 0.89 for ABMGN, 0.88 for IgAN, 0.86 for FibrillaryGN, 0.83 for MPGN, 0.80 for ANCA-GN, 0.79 for DDD, 0.78 for PGNMID, 0.75 for IAGN, 0.73 for SLE-GN-IV and CryoGN, and 0.67 for C3-GN. Performance with MILx was similar for automatically segmented glomerular crops as input. On this dataset, MILx outperformed CLAM by a significant margin, with both entire WSIs and expert-annotated glomerular crops as inputs (mean balanced accuracy of 0.72).

Conclusion: This proof-of-concept study indicates that nephropathology-specific architectures like our MILx can be trained for complex tasks on relatively small biopsy cohorts. We should be able to deliver an end-to-end pipeline for this diagnostic task and others, based on training sets with case labels provided by trusted institutions and with only minimal expert labeling or annotation required.

PAC and HQ contributed equally to this work.
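For readers unfamiliar with the balanced accuracy metric reported above, the following sketch shows the standard definition (the unweighted mean of per-class recall). It is not the authors' evaluation code; the function name and the toy labels are illustrative assumptions only.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced example (made-up labels, not data from the study).
y_true = ["IgAN"] * 4 + ["DDD"] * 2
y_pred = ["IgAN", "IgAN", "IgAN", "DDD", "DDD", "IgAN"]
score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because every class contributes equally regardless of its size, rare diagnoses weigh as much as common ones, which is why the metric suits a cohort with uneven class counts.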


Articles published on Minimal Annotation

Consensus-based iterative meta-pseudo-labeling for deep semi-supervised learning

VGTS: Visually Guided Text Spotting for novel categories in historical manuscripts

Enhancing MAUDE Database Utility by GPT-4 and Cause-Effect Visualization.

Low-resource entity resolution with domain generalization and active learning

Efficient Wheat Head Segmentation with Minimal Annotation: A Generative Approach.

Semi-Self-Supervised Domain Adaptation: Developing Deep Learning Models with Limited Annotated Data for Wheat Head Segmentation

Active Dynamic Weighting for multi-domain adaptation

GPT for medical entity recognition in Spanish

Structuring medication signeturs as a language regression task: comparison of zero- and few-shot GPT with fine-tuned models.

Multicentric intelligent cardiotocography signal interpretation using deep semi-supervised domain adaptation via minimax entropy and domain invariance

227 Spine-tuned Natural Language Models and Bespoke Regular Expression Classifiers for Automated Spinal Surgery Registry Development

Magnifying Networks for Histopathological Images with Billions of Pixels.

Self-Supervised Interactive Embedding for One-shot Organ Segmentation.

Weakly-Interactive-Mixed Learning: Less Labelling Cost for Better Medical Image Segmentation.

#6541 GLOMERULONEPHRITIS DIAGNOSIS BY MACHINE LEARNING ON PERIODIC ACID-SCHIFF (PAS) WHOLE SLIDE IMAGES

A unified microstructure segmentation approach via human-in-the-loop machine learning

Understudied Proteins and Understudied Functions in the Model Bacterium Bacillus subtilis - a Major Challenge in Current Research.

Semantic segmentation of water bodies in very high-resolution satellite and aerial images

Self-supervised learning based transformer and convolution hybrid network for one-shot organ segmentation

Motion-region annotation for complex videos via label propagation across occluders
