Abstract

Imaging flow cytometry (IFC) enables the high-throughput collection of morphological and spatial information from hundreds of thousands of single cells. This high-content, information-rich image data can in theory resolve important biological differences among complex, often heterogeneous biological samples. However, data analysis is often performed in a highly manual and subjective manner using very limited image analysis techniques in combination with conventional flow cytometry gating strategies. This approach is not scalable to the hundreds of available image-based features per cell and thus makes use of only a fraction of the spatial and morphometric information. As a result, the quality, reproducibility and rigour of results are limited by the skill, experience and ingenuity of the data analyst. Here, we describe a pipeline using open-source software that leverages the rich information in digital imagery using machine learning algorithms. Raw image files (.rif) from an imaging flow cytometer, once compensated and corrected (the proprietary .cif file format), are imported into the open-source software CellProfiler, where an image processing pipeline identifies cells and subcellular compartments, allowing hundreds of morphological features to be measured. This high-dimensional data can then be analysed with cutting-edge machine learning and clustering approaches on “user-friendly” platforms such as CellProfiler Analyst. Researchers can train an automated cell classifier to recognize different cell types, cell cycle phases, drug treatment/control conditions, etc., using supervised machine learning. This workflow should enable the scientific community to leverage the full analytical power of IFC-derived data sets. It will help to reveal otherwise unappreciated populations of cells based on features that may be hidden to the human eye, including subtle measured differences in label-free detection channels such as bright-field and dark-field imagery.

Highlights

  • It is widely accepted that cellular and molecular heterogeneity pervades all biological systems [1,2]

  • To enable the application of advanced high-throughput data analysis to imaging flow cytometry, we developed a new protocol to harvest and analyse the rich information in images acquired via imaging flow cytometers

  • Random Undersampling (RUS) boosting is tailored for highly imbalanced data sets, which may explain the superior prediction in the underrepresented class anaphase; RUS boosting is not currently an option within CellProfiler Analyst
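To illustrate why undersampling helps with a rare class such as anaphase, the following is a minimal sketch of the random undersampling step alone, written in plain Python. It is not the authors' implementation and does not include the boosting stage; in RUS boosting, a step like this is typically repeated in each boosting round before a weak learner is fitted. The function name `random_undersample`, the class labels and the toy feature vectors are all hypothetical.

```python
import random
from collections import defaultdict


def random_undersample(features, labels, seed=0):
    """Return a class-balanced subset by randomly discarding rows
    from every class until each matches the smallest class in size."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # The rarest class sets the per-class sample size.
    n_min = min(len(idxs) for idxs in by_class.values())

    # Draw n_min rows without replacement from each class.
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))
    keep.sort()  # preserve the original row order

    return [features[i] for i in keep], [labels[i] for i in keep]


# Toy, made-up data: 95 common-phase cells and 5 rare-phase cells,
# each described by a two-element feature vector.
labels = ["interphase"] * 95 + ["anaphase"] * 5
features = [[float(i), float(i % 7)] for i in range(100)]

X_bal, y_bal = random_undersample(features, labels)
```

After balancing, both classes contribute the same number of training examples, so a classifier fitted to `X_bal` and `y_bal` is no longer rewarded for ignoring the rare class. The cost is that most majority-class rows are discarded in any single draw, which is one reason the technique is usually combined with an ensemble that redraws the sample many times.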


Summary

Introduction

It is widely accepted that cellular and molecular heterogeneity pervades all biological systems [1,2]. This creates a complex set of challenges for understanding how individual cells within heterogeneous communities interact with one another to determine the phenotype and function of higher organisms. It is a significant challenge to derive meaningful, objective conclusions from the high-parameter output inherent to most cytometric approaches.
