MCAM: Multiple Clustering Analysis Methodology for Deriving Hypotheses and Insights from High-Throughput Proteomic Datasets

Kristen M Naegle, Michael B Yaffe, Forest M White, Roy E Welsch, Douglas A Lauffenburger

doi:10.1371/journal.pcbi.1002119


Open Access

https://doi.org/10.1371/journal.pcbi.1002119


Abstract

Advances in proteomic technologies continue to substantially accelerate the capability to generate experimental data on protein levels, states, and activities in biological samples. For example, studies on receptor tyrosine kinase signaling networks can now capture the phosphorylation state of hundreds to thousands of proteins across multiple conditions. However, little is known about the function of many of these protein modifications, or the enzymes responsible for modifying them. To address this challenge, we have developed an approach that enhances the power of clustering techniques to infer functional and regulatory meaning of protein states in cell signaling networks. We have created a new computational framework for applying clustering to biological data in order to overcome the typical dependence on specific a priori assumptions and expert knowledge concerning the technical aspects of clustering. Multiple clustering analysis methodology ('MCAM') employs an array of diverse data transformations, distance metrics, set sizes, and clustering algorithms, in a combinatorial fashion, to create a suite of clustering sets. These sets are then evaluated based on their ability to produce biological insights through statistical enrichment of metadata relating to knowledge concerning protein functions, kinase substrates, and sequence motifs. We applied MCAM to a set of dynamic phosphorylation measurements of the ERBB network to explore the relationships between algorithmic parameters and the biological meaning that could be inferred, and we report on interesting biological predictions. Further, we applied MCAM to multiple phosphoproteomic datasets for the ERBB network, which allowed us to compare independent and incompletely overlapping measurements of phosphorylation sites in the network. We report specific and global differences of the ERBB network stimulated with different ligands and with changes in HER2 expression. Overall, we offer MCAM as a broadly applicable approach for analysis of proteomic data which may help increase the current understanding of molecular networks in a variety of biological problems.
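The combinatorial idea described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the specific transformations (`raw`, `log2`, `zscore`), the two distance metrics, the use of agglomerative clustering with three linkage rules as the "algorithms", the toy data, and the one-sided hypergeometric test used to score enrichment of a hypothetical annotation. The sketch only shows the shape of the workflow: cluster the same data under every combination of choices, then score each clustering set by metadata enrichment.

```python
import itertools
import math


def _zscore(row):
    """Standardize one measurement profile to mean 0, unit variance."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row)) or 1.0
    return [(x - mean) / sd for x in row]


# Hypothetical menus of choices (assumed for this sketch).
TRANSFORMS = {
    "raw": lambda row: list(row),
    "log2": lambda row: [math.log2(x) for x in row],
    "zscore": _zscore,
}
METRICS = {
    "euclidean": lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))),
    "manhattan": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
}
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda ds: sum(ds) / len(ds),
}


def agglomerative(rows, k, dist, linkage):
    """Merge the two closest clusters until k remain; return one label per row."""
    clusters = [[i] for i in range(len(rows))]

    def gap(pair):
        a, b = pair
        return linkage([dist(rows[i], rows[j])
                        for i in clusters[a] for j in clusters[b]])

    while len(clusters) > k:
        a, b = min(itertools.combinations(range(len(clusters)), 2), key=gap)
        clusters[a].extend(clusters.pop(b))  # a < b, so index a stays valid
    labels = [0] * len(rows)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def run_mcam(rows, ks):
    """Cluster under every (transform, metric, k, linkage) combination."""
    results = {}
    for (t_name, t), (d_name, d), k, (l_name, link) in itertools.product(
            TRANSFORMS.items(), METRICS.items(), ks, LINKAGES.items()):
        data = [t(row) for row in rows]
        results[(t_name, d_name, k, l_name)] = agglomerative(data, k, d, link)
    return results


def hypergeom_tail(N, K, n, x):
    """P(X >= x): chance of x or more annotated items in a cluster of size n,
    when K of the N items overall carry the annotation."""
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(x, min(K, n) + 1))
    return hits / math.comb(N, n)


def min_enrichment_p(labels, annotated):
    """Smallest enrichment p-value over the clusters of one clustering set."""
    best = 1.0
    for c in set(labels):
        members = {i for i, lab in enumerate(labels) if lab == c}
        p = hypergeom_tail(len(labels), len(annotated),
                           len(members), len(members & annotated))
        best = min(best, p)
    return best


# Toy data (invented): rows 0-2 rise steadily, rows 3-5 peak early and decay.
ROWS = [
    [1.0, 2.0, 4.0, 8.0], [1.0, 2.2, 4.1, 7.9], [1.0, 1.9, 3.8, 8.2],
    [1.0, 8.0, 2.0, 1.0], [1.0, 7.8, 2.1, 1.1], [1.0, 8.2, 1.9, 0.9],
]
ANNOTATED = {0, 1, 2}  # hypothetical: rows sharing some metadata term

results = run_mcam(ROWS, ks=(2, 3))
for key, labels in sorted(results.items()):
    print(key, labels, round(min_enrichment_p(labels, ANNOTATED), 3))
```

A real analysis would test many annotation terms per cluster and would need a correction for multiple hypothesis testing, which this sketch omits.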

Highlights

Large and complex high-throughput proteomic experimental studies are becoming more accessible through the use of powerful, swiftly developing platforms such as mass spectrometry (MS), flow cytometry (FC), and various kinds of protein microarrays (PMA) [1,2,3].
As one particular example of increasing attention, there has been an explosion in large-scale datasets for receptor tyrosine kinase (RTK) network signaling by the combination of protein post-translational modification enrichment followed by quantitative MS methods [4].
Multiple clustering analysis methodology (MCAM) was developed to capitalize on the success unsupervised learning has had on biological inference in the past and apply it to a new challenge in the field, that of understanding the function and regulation of phosphorylation in the ERBB network.

Summary

Introduction

Large and complex high-throughput proteomic experimental studies are becoming more accessible through the use of powerful, swiftly developing platforms such as mass spectrometry (MS), flow cytometry (FC), and various kinds of protein microarrays (PMA) [1,2,3]. As one particular example of increasing attention, there has been an explosion in large-scale datasets for receptor tyrosine kinase (RTK) network signaling, generated by combining protein post-translational modification enrichment with quantitative MS methods [4]. In RTK networks, such as those activated by the ERBB family of receptors, phosphorylation plays a central role in the translation of extracellular cues into phenotypic changes, such as differentiation, proliferation, and migration [5]. Mass spectrometry measurement of phosphorylation events in cellular signaling networks is greatly increasing our understanding of the specific modifications occurring in the cell as well as their relative changes in response to network perturbations, such as ligand stimulation or kinase inhibition. Unsupervised computational learning methods, applied to quantitative phosphoproteomic data, provide one method by …



Journal: PLoS Computational Biology
Publication Date: Jul 21, 2011
Citations: 50
License type: CC BY 4.0
