Prior Class Probabilities Research Articles

One of the goals of AI-based computational pathology is to generate compact representations of whole slide images (WSIs) that capture the essential information needed for diagnosis. While such approaches have been applied to histopathology, few applications have been reported in cytology. Bone marrow aspirate cytology is the basis for key clinical decisions in hematology. However, visual inspection of aspirate specimens is a tedious and complex process subject to variation in interpretation, and hematopathology expertise is scarce. The ability to generate a compact representation of an aspirate specimen may form the basis for clinical decision-support tools in hematology. In this study, we leverage our previously published end-to-end AI-based system for counting and classifying cells from bone marrow aspirate WSIs, which enables the direct use of individual cells as inputs rather than WSI patches. We then construct bags of individual cell features from each WSI, and apply multiple instance learning to extract their vector representations. To evaluate the quality of our representations, we conducted WSI retrieval and classification tasks. Our results show that we achieved a mAP@10 of 0.58 ± 0.02 in WSI-level image retrieval, surpassing the random-retrieval baseline of 0.39 ± 0.1. Furthermore, we predicted five diagnostic labels for individual aspirate WSIs with a weighted-average F1 score of 0.57 ± 0.03 using a k-nearest-neighbors (k-NN) model, outperforming guessing using empirical class prior probabilities (0.26 ± 0.02). We present the first example of exploring trainable mechanisms to generate compact, slide-level representations in bone marrow cytology with deep learning. This method has the potential to summarize complex semantic information in WSIs toward improved diagnostics in hematology, and may eventually support AI-assisted computational pathology approaches.
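The "guessing using empirical class prior probabilities" baseline mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels "A", "B", "C", and the fixed random seed are all assumptions made for illustration, and the weighted F1 here is the usual support-weighted mean of per-class F1 scores.

```python
import random
from collections import Counter

def empirical_priors(labels):
    """Relative frequency of each class in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {c: n / total for c, n in counts.items()}

def guess_from_priors(priors, n, rng):
    """Draw n labels at random, each class picked with its prior probability."""
    classes = sorted(priors)
    weights = [priors[c] for c in classes]
    return rng.choices(classes, weights=weights, k=n)

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for c, n_c in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = n_c - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_c / total) * f1
    return score

# Hypothetical toy data: three made-up classes, not the five diagnostic labels of the study.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
train = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
test = ["A"] * 5 + ["B"] * 3 + ["C"] * 2

priors = empirical_priors(train)
preds = guess_from_priors(priors, len(test), rng)
baseline = weighted_f1(test, preds)
print(priors, round(baseline, 3))
```

A learned model is informative only to the extent that its weighted F1 exceeds the score of this kind of prior-driven guessing on the same test labels.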

We critically re-examine the Saerens-Latinne-Decaestecker (SLD) algorithm, a well-known method for estimating class prior probabilities (“priors”) and adjusting posterior probabilities (“posteriors”) in scenarios characterized by distribution shift, i.e., difference in the distribution of the priors between the training and the unlabelled documents. Given a machine learned classifier and a set of unlabelled documents for which the classifier has returned posterior probabilities and estimates of the prior probabilities, SLD updates them both in an iterative, mutually recursive way, with the goal of making both more accurate; this is of key importance in downstream tasks such as single-label multiclass classification and cost-sensitive text classification. Since its publication, SLD has become the standard algorithm for improving the quality of the posteriors in the presence of distribution shift, and SLD is still considered a top contender when we need to estimate the priors (a task that has become known as “quantification”). However, its real effectiveness in improving the quality of the posteriors has been questioned. We here present the results of systematic experiments conducted on a large, publicly available dataset, across multiple amounts of distribution shift and multiple learners. Our experiments show that SLD improves the quality of the posterior probabilities and of the estimates of the prior probabilities, but only when the number of classes in the classification scheme is very small and the classifier is calibrated. As the number of classes grows, or as we use non-calibrated classifiers, SLD converges more slowly (and often does not converge at all), performance degrades rapidly, and the impact of SLD on the quality of the prior estimates and of the posteriors becomes negative rather than positive.
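The iterative, mutually recursive update described above is usually presented as an expectation-maximization loop. The following is a minimal pure-Python sketch under stated assumptions: the function name `sld`, its parameters, and the toy posteriors are hypothetical, training priors are assumed strictly positive, and the stopping rule (largest change in any prior below `tol`) is one common choice rather than something taken from the article.

```python
def sld(posteriors, train_priors, max_iter=1000, tol=1e-8):
    """EM-style prior re-estimation and posterior adjustment (illustrative sketch).

    posteriors:   list of rows, one per unlabelled item, each row a list of
                  classifier posteriors p(c | x) that sums to 1
    train_priors: list of class priors observed in the training data (all > 0)
    Returns (estimated_priors, adjusted_posteriors).
    """
    n_classes = len(train_priors)
    priors = list(train_priors)  # start from the training priors
    adjusted = [list(row) for row in posteriors]
    for _ in range(max_iter):
        # E-step: rescale each posterior by the ratio of the current prior
        # estimate to the training prior, then renormalise the row.
        adjusted = []
        for row in posteriors:
            scaled = [row[c] * priors[c] / train_priors[c] for c in range(n_classes)]
            z = sum(scaled)
            adjusted.append([s / z for s in scaled])
        # M-step: the new prior estimate is the mean adjusted posterior.
        new_priors = [
            sum(r[c] for r in adjusted) / len(adjusted) for c in range(n_classes)
        ]
        shift = max(abs(a - b) for a, b in zip(new_priors, priors))
        priors = new_priors
        if shift < tol:
            break
    return priors, adjusted

# Hypothetical toy input: two classes, balanced training priors,
# unlabelled items whose posteriors lean toward class 0.
toy_posteriors = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.2, 0.8]]
est_priors, adj = sld(toy_posteriors, [0.5, 0.5])
print(est_priors)
print(adj[0])
```

In this toy case the estimated prior for class 0 moves above the training value of 0.5, and the adjusted posteriors shift in the same direction. The article's caution applies here too: with many classes or an uncalibrated classifier, the same loop may converge slowly or make the estimates worse.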

Articles published on Prior Class Probabilities

A Comparative Study of Landslide Susceptibility Mapping Using Bagging PU Learning in Class-Prior Probability Shift Datasets

Expand and Shrink: Federated Learning with Unlabeled Data Using Clustering

Whole slide image representation in bone marrow cytology

Threshold optimization and random undersampling for imbalanced credit card data

Factorizable Joint Shift in Multinomial Classification

Multi-instance positive and unlabeled learning with bi-level embedding

From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence

Estimating the class prior for positive and unlabelled data via logistic regression

A Critical Reassessment of the Saerens-Latinne-Decaestecker Algorithm for Posterior Probability Adjustment

Information-Theoretic Representation Learning for Positive-Unlabeled Classification

Incorporating spatial association into statistical classifiers: local pattern-based prior tuning

Cost-sensitive support vector machines

A personalized mail re-filtering system based on the client

Image Classification by Integrating Reject Option and Prior Information

A systematic study of the class imbalance problem in convolutional neural networks

Optimizing community-level surveillance data for pediatric asthma management

Investigation on Land Cover Mapping Capability of Maximum Likelihood Classifier: A Case Study on North Canara, India

An empirical study on the effect of imbalanced data on bleeding detection in endoscopic video

Extended least squares support vector machines for ordinal regression

Subspace Learning via Local Probability Distribution for Hyperspectral Image Classification

