Abstract

Protein subcellular localization has been systematically characterized in budding yeast using fluorescently tagged proteins. Based on the fluorescence microscopy images, subcellular localization of many proteins can be classified automatically using supervised machine learning approaches that have been trained to recognize predefined image classes based on statistical features. Here, we present an unsupervised analysis of protein expression patterns in a set of high-resolution, high-throughput microscope images. Our analysis is based on seven biologically interpretable features, which are evaluated on automatically identified cells and whose cell-stage dependency is captured by a continuous model for cell growth. We show that it is possible to identify most previously identified localization patterns in a cluster analysis based on these features, and that similarities between the inferred expression patterns contain more information about protein function than can be explained by a previous manual categorization of subcellular localization. Furthermore, the inferred cell stage associated with each fluorescence measurement allows us to visualize large groups of proteins entering the bud at specific stages of bud growth. These correspond to proteins localized to organelles, revealing that the organelles must be entering the bud in a stereotypical order. We also identify and organize a smaller group of proteins that show subtle differences in the way they move around the bud during growth. Our results suggest that biologically interpretable features based on explicit models of cell morphology will yield unprecedented power for pattern discovery in high-resolution, high-throughput microscopy images.

Highlights

  • High-content screening of fluorescently tagged proteins has been widely applied to systematically characterize subcellular localizations of proteins in a variety of settings [1]

  • A collection of ~4000 yeast strains was introduced, where in each strain a single protein was tagged with green fluorescent protein (GFP)

  • We show that by training a computer to accurately identify the buds of growing yeast cells, and making simple fluorescence measurements in the context of cell shape and cell stage, the computer could automatically discover most of the localization patterns without any prior knowledge of what the patterns might be

Introduction

High-content screening of fluorescently tagged proteins has been widely applied to systematically characterize subcellular localizations of proteins in a variety of settings [1]. Unsupervised analysis has the advantage that it is unbiased by prior ‘expert’ knowledge, such as the arbitrary discretization of protein expression patterns into recognizable classes. For these reasons, unsupervised cluster analysis has become a vital tool of computational biology through its application to genome-wide mRNA expression measurements [4,5,6,7] and protein-protein interaction data [8]. It has also been applied in automated microscopy image analysis [9,10,11,12,13], where it has been shown to provide complementary capabilities to supervised approaches.
