Abstract

Background: The expansion of automatic imaging technologies has created a need to efficiently compare and review large sets of image data. To compare image data between samples, we must first define the normal variation among distinct images of the same sample. Even under tightly controlled experimental conditions, protein expression can vary widely between cells, and because large image sets are difficult to view and compare, this variation may go unobserved. Here we introduce a novel methodology, iCluster, for visualizing, clustering and comparing large sub-cellular localization image sets. For each member of an image set, iCluster generates statistics that have been found to be useful in distinguishing sub-cellular localization. The statistics are mapped into two or three dimensions so as to preserve distances between the statistics vectors. The complete image set is then visualized in two or three dimensions using the coordinates so determined. As a result, statistically similar images are spatially close in the visualization, which allows easy comparison of similar images and separation of dissimilar images into distinct clusters.

Results: The methodology was tested on a set of 502 previously published images containing 10 known sub-cellular localizations. The clustering of images of like type was evaluated both by examining the classes of the nearest neighbors of each image and by visual inspection. In three dimensions, 3-neighbor classification accuracy was 83.2%. Visually, each class clustered well, with the majority of classes localizing to distinct regions of the space. In two dimensions, 3-neighbor classification accuracy was 68.9%, though clustering into classes could still be readily discerned visually. Computational expense was found to be relatively low, and sets of up to 1400 images could be visualized and interacted with in real time.

Conclusion: The feasibility of automated spatial layout to allow comparison and discrimination of high-throughput sub-cellular imaging has been demonstrated. There are many potential applications, such as image database curation, semi-automated interactive classification, outlier detection and reference image comparison. By allowing observation of the full range of imaging data available from modern microscopes, these methods will provide an invaluable tool for cell biologists.
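
The neighbor-based evaluation reported in the Results can be illustrated with a short sketch. This is not the authors' implementation: the function name `knn_loo_accuracy`, the leave-one-out protocol, the majority vote and the tie-breaking rule (ties go to the class seen first among the nearest neighbors) are all assumptions made for illustration, and the coordinates in the usage lines are synthetic.

```python
import numpy as np
from collections import Counter


def knn_loo_accuracy(coords, labels, k=3):
    """Fraction of points whose label matches the majority label of their
    k nearest neighbors (the point itself is excluded).

    Hypothetical helper; the exact evaluation protocol is an assumption.
    """
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    # Full Euclidean distance matrix between all mapped points.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    correct = 0
    for i in range(len(labels)):
        nearest = np.argsort(dist[i])[:k]  # indices of the k closest points
        votes = Counter(labels[nearest].tolist())
        predicted = votes.most_common(1)[0][0]
        correct += int(predicted == labels[i])
    return correct / len(labels)


# Toy usage with synthetic 2D coordinates for two well-separated classes.
toy_coords = [[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]]
toy_labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(knn_loo_accuracy(toy_coords, toy_labels, k=3))
```

The same function could be applied to the two- or three-dimensional coordinates of a mapped image set, with the known localization classes as labels, to obtain a figure comparable in spirit to the 3-neighbor accuracies quoted above.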

Highlights

  • The expansion of automatic imaging technologies has created a need to be able to efficiently compare and review large sets of image data

  • The aim is to map the set of statistics vectors of the images into 2 or 3 dimensions such that the distances between points of the image set are preserved as well as possible

  • Figure 2: Visualizing the image set in 3 dimensions. (A) The 502 endogenous images visualized in 3 dimensions with coordinates determined from Sammon mapping of Haralick and threshold adjacency statistics (TAS) measures. (B) 1407 cell images automatically cropped from the 502 images and visualized in 3 dimensions with coordinates determined from Sammon mapping of Haralick and TAS measures
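
The mapping aim stated in the highlights (place each statistics vector in 2 or 3 dimensions so that pairwise distances are preserved) can be sketched as a small Sammon-style optimization. This is a minimal illustration under stated assumptions, not the iCluster implementation: the function names, the random initialization and the backtracking gradient descent are choices made here for simplicity (Sammon's original method uses a different update rule), the input vectors are assumed to be distinct, and the data in the usage lines are synthetic.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(d_high, d_low, eps=1e-12):
    """Sammon stress: squared distance errors weighted by 1 / original distance,
    normalized by the sum of the original distances."""
    upper = np.triu_indices_from(d_high, k=1)
    dh, dl = d_high[upper], d_low[upper]
    return ((dh - dl) ** 2 / (dh + eps)).sum() / (dh.sum() + eps)


def sammon_map(features, n_dims=3, n_iter=300, seed=0, eps=1e-12):
    """Map feature vectors to `n_dims` coordinates by descending the Sammon stress.

    Hypothetical helper: plain gradient descent with a backtracking step size.
    Returns the coordinates and the stress after each accepted step.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    d_high = pairwise_distances(features)
    total = d_high[np.triu_indices_from(d_high, k=1)].sum() + eps
    coords = rng.normal(scale=1e-2, size=(features.shape[0], n_dims))
    history = [sammon_stress(d_high, pairwise_distances(coords))]
    step = 1.0
    for _ in range(n_iter):
        d_low = pairwise_distances(coords)
        # Per-pair weight (d*_ij - d_ij) / (d*_ij * d_ij); diagonal is unused.
        weights = (d_high - d_low) / (d_high * d_low + eps)
        np.fill_diagonal(weights, 0.0)
        # Gradient of the stress: (-2 / total) * sum_j w_ij * (y_i - y_j).
        grad = (-2.0 / total) * (
            weights.sum(axis=1, keepdims=True) * coords - weights @ coords
        )
        # Backtracking: shrink the step until the stress does not increase.
        while step > 1e-12:
            trial = coords - step * grad
            trial_stress = sammon_stress(d_high, pairwise_distances(trial))
            if trial_stress <= history[-1]:
                coords = trial
                history.append(trial_stress)
                step *= 1.5  # try a larger step next time
                break
            step *= 0.5
    return coords, history


# Toy usage: 30 random 10-dimensional "statistics vectors" (synthetic, not image data).
demo_features = np.random.default_rng(1).normal(size=(30, 10))
demo_coords, demo_history = sammon_map(demo_features, n_dims=3, n_iter=100)
print(demo_coords.shape, demo_history[0], demo_history[-1])
```

Weighting each squared error by the inverse of the original distance makes the mapping favor accurate placement of near neighbors, which is the property a cluster visualization relies on.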


Summary

Introduction

The expansion of automatic imaging technologies has created a need to efficiently compare and review large sets of image data. High-throughput automated fluorescent microscope imaging technologies enable the experimental determination of a protein's sub-cellular localization and its dynamic trafficking within a range of cellular contexts. These approaches generate vast numbers of images, including multiple fluorophores for cells under a variety of experimental conditions [3,4]. Statistical measures may be used to generate a numeric vector for each cell image, and have a wide range of applications such as automated sub-cellular localization classification [9,10,11,12], image clustering [13], representative image selection [14,15] and statistical differentiation of protein localization under varying experimental conditions [16].

