Abstract LB012: Evaluating computational approaches for CPTAC pan-cancer cross-cohort protein expression comparison

Jixin Wang,Wen Yu,Wenyan Zhong,Elaine Hurt,John Bullen,Xiaowen Tian

doi:10.1158/1538-7445.am2024-lb012

Abstract

Recently, the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) generated harmonized genomic, transcriptomic, proteomic, and clinical data for >1000 tumors in 10 cohorts to facilitate pan-cancer discovery research. However, comparing protein expression across CPTAC cohorts remains challenging because missing-data patterns and protein expression distributions differ across tumor types. Here we present our efforts to evaluate various missing data handling and normalization strategies to generate a normalized pan-cancer protein expression dataset. First, we developed a novel algorithm to select proteins that are robustly expressed in tumors in any of the CPTAC cohorts. Second, we applied a cohort hybrid imputation approach to protein abundance values from FragPipe within each cohort, based on protein expression distribution patterns. Third, we calculated iBAQ from the protein abundance values and applied either global quantile normalization or smooth quantile normalization. To assess whether our missing data imputation and normalization strategy affected downstream analyses, we compared the fold change in differential protein expression between tumor and matched normal tissue for each cohort using non-normalized, global quantile normalized, and smooth quantile normalized protein iBAQ values. Our results demonstrate a strong correlation in fold change between global quantile normalized data and non-normalized data (Pearson r = 0.97 for ccRCC, r = 0.96 for COAD, r = 0.99 for LUAD, and r = 0.99 for LSCC). Similar results were observed when comparing smooth quantile normalized data with non-normalized data (Pearson r = 1.00 for ccRCC, COAD, LUAD, and LSCC), indicating that both normalization methods retained biological differences between tumor and matched normal tissues within cohorts.
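The normalization check described above can be illustrated with a small sketch. This is not the authors' pipeline: it assumes imputation has already removed missing values, it works on made-up log2 intensities rather than real iBAQ values, it covers only global quantile normalization (not smooth quantile normalization), and the function names `quantile_normalize`, `log2_fold_change`, and `pearson` are hypothetical helpers introduced for illustration.

```python
from statistics import mean


def quantile_normalize(columns):
    """Global quantile normalization of equal-length sample columns.

    Each value is replaced by the mean of the values holding the same
    rank across all samples. Ties are broken by original order, which
    is a simplification of what production tools do.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of proteins")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [mean(col[k] for col in sorted_cols) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def log2_fold_change(tumor_cols, normal_cols):
    """Per-protein mean(tumor) minus mean(normal) on log2-scale values."""
    n = len(tumor_cols[0])
    return [
        mean(c[i] for c in tumor_cols) - mean(c[i] for c in normal_cols)
        for i in range(n)
    ]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


# Toy log2 intensities (made up): 2 tumor + 2 normal samples, 5 proteins each.
tumor = [[10.0, 12.5, 8.0, 15.0, 11.0], [10.4, 12.0, 8.6, 14.2, 11.5]]
normal = [[9.0, 12.8, 9.5, 13.0, 11.2], [9.3, 13.1, 9.1, 12.6, 10.9]]

normalized = quantile_normalize(tumor + normal)
fc_raw = log2_fold_change(tumor, normal)
fc_norm = log2_fold_change(normalized[:2], normalized[2:])
r = pearson(fc_raw, fc_norm)
print(f"Pearson r between raw and normalized fold changes: {r:.3f}")
```

A value of r close to 1 on real data would suggest, as the abstract reports, that normalization left the tumor-versus-normal differences largely intact.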
Lastly, we identified several proteins (ERAP2, CA9, GSTM3, MX1, STAT1) whose protein and RNA expression were highly correlated across eight CPTAC cohorts (r > 0.7 for COAD, BRCA, LUAD, ccRCC, PDAC, UCEC, HNSCC, and LSCC). We then compared their protein expression rank across CPTAC cohorts with their RNA expression rank across the corresponding TCGA cohorts. Specifically, the median log2(iBAQ) in CPTAC and the median log2(TPM) in TCGA were calculated for those proteins within each indication; indications were then ranked by median log2(iBAQ) in CPTAC and by median log2(TPM) in TCGA. Weighted rank correlation was used to measure rank agreement. Global quantile normalization gave the highest rank correlation (weighted rank correlation between 0.597 and 0.931) compared with smooth quantile normalization or no normalization. These results suggest that the combination of cohort hybrid imputation and global quantile normalization is a reasonable approach to generating a normalized CPTAC pan-cancer protein dataset that could be leveraged to interrogate protein expression across different cancer types.

Citation Format: Jixin Wang, Xiaowen Tian, Wen Yu, John Bullen, Elaine Hurt, Wenyan Zhong. Evaluating computational approaches for CPTAC pan-cancer cross-cohort protein expression comparison [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 2 (Late-Breaking, Clinical Trial, and Invited Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(7_Suppl):Abstract nr LB012.
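The rank-agreement comparison in the abstract (median per indication, rank the indications, then a weighted rank correlation) might be sketched as follows. The abstract does not say which weighted rank correlation was used, so the measure here is an assumption: a Kendall-style tau in which each pair of indications is weighted by 1/(rank + 1) of its members, with ranks taken from the protein-level medians, so that disagreements among the highest-expressing indications count more. All values are made up, and `ranks_desc` and `weighted_tau` are hypothetical helper names.

```python
from statistics import median


def ranks_desc(values):
    """Rank 0 is the largest value; ties are broken by original order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks


def sign(v):
    return (v > 0) - (v < 0)


def weighted_tau(x, y):
    """Kendall-style tau that up-weights pairs involving top-ranked items.

    Pair weight is 1/(r_i + 1) + 1/(r_j + 1), where r is the rank by x
    (0 = highest). This weighting is an illustrative assumption.
    """
    rx = ranks_desc(x)
    num = den = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            w = 1.0 / (rx[i] + 1) + 1.0 / (rx[j] + 1)
            num += w * sign(x[i] - x[j]) * sign(y[i] - y[j])
            den += w
    return num / den


# Toy per-sample values for one protein, grouped by indication (made up).
cptac_log2_ibaq = {
    "ccRCC": [21.0, 22.4, 21.7],
    "COAD": [17.9, 18.3, 18.8],
    "LUAD": [19.5, 20.1, 19.0],
    "LSCC": [18.6, 19.9, 19.2],
}
tcga_log2_tpm = {
    "ccRCC": [8.1, 7.6, 8.4],
    "COAD": [3.2, 2.9, 3.5],
    "LUAD": [4.4, 4.9, 4.1],
    "LSCC": [5.0, 4.6, 5.3],
}

indications = sorted(cptac_log2_ibaq)
protein_medians = [median(cptac_log2_ibaq[k]) for k in indications]
rna_medians = [median(tcga_log2_tpm[k]) for k in indications]
tau = weighted_tau(protein_medians, rna_medians)
print(f"Weighted rank agreement across {len(indications)} indications: {tau:.3f}")
```

For real analyses, SciPy's `scipy.stats.weightedtau` provides a weighted Kendall tau with hyperbolic weighting by default and may be a more robust choice than this hand-rolled version.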

Journal: Cancer Research
Publication Date: Apr 5, 2024
Citations: 1
