Abstract

Background: Image-based high throughput (HT) screening provides a rich source of information on dynamic cellular response to external perturbations. The large quantity of data generated necessitates computer-aided quality control (QC) methodologies to flag imaging and staining artifacts. Existing image- or patch-level QC methods require a separate threshold to be tuned simultaneously for each image quality metric used, and they struggle to distinguish artifacts from valid cellular phenotypes. As a result, extensive time and effort must be spent on per-assay QC feature thresholding, and valid images and phenotypes may be discarded while image- and cell-level artifacts go undetected.

Results: We present a novel cell-level QC workflow built on machine learning approaches for classifying artifacts from HT image data. First, a phenotype sampler based on unlabeled clustering collects a comprehensive subset of cellular phenotypes, requiring inspection of only a handful of images per phenotype for validity. A one-class support vector machine is then trained on each biologically valid image phenotype, and the resulting set of classifiers is used to classify individual objects in each image as valid cells or artifacts. We apply this workflow to two real-world large-scale HT image datasets and observe that the ratio of artifact to total object area (ARcell) provides a single robust assessment of image quality regardless of the underlying causes of quality issues. By gating on this single intuitive metric, partially contaminated images can be salvaged and highly contaminated images can be excluded before image-level phenotype summary, enabling a more reliable characterization of cellular response dynamics.

Conclusions: Our cell-level QC workflow enables identification of artificial cells created not only by staining or imaging artifacts but also by the limitations of image segmentation algorithms. The single readout ARcell, which summarizes the ratio of artifacts contained in each image, can be used to reliably rank images by quality and to determine QC cutoff thresholds more accurately. Machine learning-based cellular phenotype clustering and sampling reduces the amount of manual work required for training example collection. Our QC workflow automatically handles assay-specific phenotypic variations and generalizes to different HT image assays.
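To make the ARcell readout concrete, the following is a minimal sketch of how the ratio of artifact area to total object area could be computed and used as a gate. It is not the authors' implementation: the `SegmentedObject` type, the function names, the 0.5 exclusion cutoff, and the choice to score an image with no objects as 0 are all assumptions made for illustration. The per-object artifact labels are taken as given (in the workflow above they would come from the one-class classifiers).

```python
from dataclasses import dataclass


@dataclass
class SegmentedObject:
    """Hypothetical record for one segmented object in an image."""
    area: float        # object area in pixels
    is_artifact: bool  # label assigned by a cell-level classifier


def ar_cell(objects):
    """Ratio of artifact area to total object area for one image."""
    total = sum(o.area for o in objects)
    if total == 0:
        return 0.0  # assumption: an image with no objects scores 0
    artifact = sum(o.area for o in objects if o.is_artifact)
    return artifact / total


def gate_image(objects, exclude_above=0.5):
    """Return (keep_image, valid_objects).

    Images whose ARcell exceeds the (hypothetical) cutoff are excluded;
    otherwise the image is salvaged by keeping only its valid objects.
    """
    if ar_cell(objects) > exclude_above:
        return False, []
    return True, [o for o in objects if not o.is_artifact]


# Toy data: one partially contaminated and one highly contaminated image.
partial = [
    SegmentedObject(300, False),
    SegmentedObject(100, True),
    SegmentedObject(100, False),
]
heavy = [SegmentedObject(100, False), SegmentedObject(400, True)]

keep_partial, valid_partial = gate_image(partial)
keep_heavy, valid_heavy = gate_image(heavy)
```

In this toy case the partially contaminated image is kept with its two valid objects, while the highly contaminated one is dropped before any image-level summary is computed. In practice the cutoff would be chosen per assay after ranking images by ARcell.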

Highlights

  • Image-based high throughput (HT) screening provides a rich source of information on dynamic cellular response to external perturbations

  • Due to the lack of standardized ground truth image quality scores for HT image data to directly compare against, we demonstrate the benefits of our approach by applying it to two real-world assays, showing how it favorably compares to existing image quality control (QC) methods in a variety of applications

  • We have presented a QC workflow built on two novel insights that gives users more robust and fine-grained quality control capabilities, while avoiding the lengthy and difficult process of simultaneously thresholding multiple QC features for every assay


Summary

Introduction

Image-based high throughput (HT) screening provides a rich source of information on dynamic cellular response to external perturbations. Image-level QC metrics, however, can shift with valid biology: DNA accumulation in apoptotic cells can lead to more saturated pixels, high protein expression ratios may spread the image intensity over a wider range, and a shift to fewer but larger cells leads to fewer edge pixels and fewer high-frequency components in the image power spectrum. The dependency of such QC metrics on cellular context, such as cell counts and morphology, can cause images with valid phenotypic variations to be discarded, leading to low true positive rates. These QC metrics may also be unaffected by true cell-level artifacts, such as segmentation failures from treated cells adhering together and forming blobs, leading to low true negative rates as well.
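The dependency on cellular context can be illustrated with a toy version of one such metric, the fraction of saturated pixels. This is only a sketch under stated assumptions: the function names, the 8-bit saturation value of 255, the 5% threshold, and the tiny hand-made images are hypothetical and are not taken from the paper.

```python
def saturated_fraction(image, max_value=255):
    """Fraction of pixels at or above the saturation value."""
    pixels = [p for row in image for p in row]
    return sum(1 for p in pixels if p >= max_value) / len(pixels)


def passes_saturation_qc(image, max_fraction=0.05, max_value=255):
    """Image-level gate with a fixed (hypothetical) threshold."""
    return saturated_fraction(image, max_value) <= max_fraction


# Toy 4x4 images. The second stands in for a valid phenotype in which
# bright, condensed nuclei saturate a few pixels.
control = [
    [10, 12, 11, 10],
    [12, 90, 95, 11],
    [11, 92, 94, 10],
    [10, 11, 12, 10],
]
apoptotic = [
    [10, 12, 11, 10],
    [12, 255, 250, 11],
    [11, 255, 248, 10],
    [10, 11, 12, 10],
]

control_ok = passes_saturation_qc(control)
apoptotic_ok = passes_saturation_qc(apoptotic)
```

With a fixed threshold, the second image is rejected because 2 of its 16 pixels are saturated, even though nothing is wrong with the acquisition. This is the kind of false rejection that motivates scoring individual objects instead of whole images.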

