Abstract
Background: Identifying corresponding features (LC peaks registered by identical peptides) in multiple Liquid Chromatography/Mass Spectrometry (LC-MS) datasets plays a crucial role in the analysis of complex peptide or protein mixtures. Warping functions are commonly used to correct the mean elution time shift among LC-MS datasets, but they cannot resolve the ambiguity in corresponding feature identification because individual elution time shifts are random. We propose a Statistical Corresponding Feature Identification Algorithm (SCFIA) based on both elution time shifts and peak shape correlations between corresponding features. SCFIA first trains a set of statistical models; all candidate corresponding features are then scored by these models to find the maximum likelihood solution.
Results: We test SCFIA on publicly available datasets. We first compare its performance with that of warping-function-based methods, and the results show significant improvements. The performance of SCFIA on replicate datasets and fractionated datasets is also evaluated. In both cases, the accuracy is above 90%, which is near optimal. Finally, the coverage of SCFIA is evaluated, and it is shown that SCFIA can find corresponding features in multiple datasets for over 90% of the peptides identified by Tandem MS.
Conclusions: SCFIA can be used for accurate corresponding feature identification in LC-MS. We have shown that peak shape correlation can be used effectively to improve accuracy. SCFIA provides high coverage in corresponding feature identification across multiple datasets, which serves as the basis for integrating multiple LC-MS measurements for accurate peptide quantification.
Highlights
Identifying corresponding features (LC peaks registered by identical peptides) in multiple Liquid Chromatography/Mass Spectrometry (LC-MS) datasets plays a crucial role in the analysis of complex peptide or protein mixtures
One important task in LC-MS/MS processing is the identification of corresponding features in multiple datasets, which is critical for the integration of quantification information to reduce measurement variation [2]
To address the proposed problem, we develop a Statistical Corresponding Feature Identification Algorithm (SCFIA) which identifies corresponding features based on matching elution times and elution peak shapes
Summary
Identifying corresponding features (LC peaks registered by identical peptides) in multiple Liquid Chromatography/Mass Spectrometry (LC-MS) datasets plays a crucial role in the analysis of complex peptide or protein mixtures. In LC-MS/MS processing, this identification is critical for integrating quantification information across datasets to reduce measurement variation [2]. If a peptide is picked up by Tandem MS, its LC elution peak can be located exactly in the LC-MS data. We refer to such LC peaks as “features with identity”. If a peptide is not picked up by Tandem MS, the location of its elution peak is unknown, and its LC peak is called “a feature with unknown identity”.
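The scoring idea described above (train statistical models, then score each candidate on its elution time shift and peak shape correlation and keep the most likely one) can be illustrated with a minimal sketch. This is not the SCFIA implementation: the Gaussian model for time shifts, the exponential model for `1 - correlation`, the independence of the two terms, and all function names are assumptions made for illustration only.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length peak profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den > 0 else 0.0


def train(pairs):
    """Fit the two hypothetical models from known corresponding pairs.

    pairs: list of (elution_time_shift, peak_shape_correlation), e.g. taken
    from features with identity that were matched across two datasets.
    """
    shifts = [s for s, _ in pairs]
    dists = [1.0 - r for _, r in pairs]
    return {
        "mu": mean(shifts),          # assumed Gaussian mean of the shift
        "sigma": stdev(shifts),      # assumed Gaussian spread of the shift
        "rate": 1.0 / mean(dists),   # assumed exponential rate for 1 - r
    }


def score(model, shift, corr):
    """Log-likelihood of a candidate, treating both terms as independent."""
    mu, sigma, rate = model["mu"], model["sigma"], model["rate"]
    shift_ll = (-0.5 * math.log(2 * math.pi * sigma ** 2)
                - (shift - mu) ** 2 / (2 * sigma ** 2))
    corr_ll = math.log(rate) - rate * (1.0 - corr)
    return shift_ll + corr_ll


def best_candidate(model, ref_time, ref_peak, candidates):
    """Return the index of the highest-scoring candidate.

    candidates: list of (elution_time, peak_profile) in the other dataset.
    """
    scores = [
        score(model, t - ref_time, pearson(ref_peak, peak))
        for t, peak in candidates
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch the models would be trained on features with identity, then applied to candidates for a feature with unknown identity; a real implementation would also need to restrict candidates by m/z and charge, which is omitted here.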