Abstract

Since the concept of the “proteome” was introduced 10 years ago, the volume of data collected during a proteomic experiment has increased dramatically. Proteomic experimental methods have expanded to include multidimensional protein separation, evaluation by mass spectrometry, protein identification through tandem mass spectrometry, and quantitation through multiple reaction monitoring (MRM). To characterize these results, diverse bioinformatics tools have been developed. In the broadest sense, the goal of these tools is to alleviate the bottleneck created by the volume of data collected. Bioinformatics has contributed significantly to proteomics. One such contribution is the development of common data formats for proteomic mass spectrometric data, such as mzXML and mzData, and the recent release of mzML as a joint format [1]. With such common data formats in place, open-source tools have become not only available but useful to a wide audience. Data repositories have made data more accessible during publication review and provide additional data for comparison in subsequent analyses. The value of these repositories will only increase as additional data are deposited over time. While bioinformatics tools have contributed to common research goals, several significant hurdles remain in proteomics where bioinformatics could help. For example, even with recent advances in sample processing and instrumentation technologies, reproducibility of experimental data remains an issue that hinders meaningful interpretation of results. Bioinformatics tools that incorporate knowledge of the physical process of proteomic data collection to improve protein identification, quantitation, and data normalization would be very valuable in studies involving differential analysis of protein expression.
Tools that assess the quality of proteomic data based on sound statistical principles are also needed. Such assessment will allow us to draw conclusions from proteomic studies, from simple fold-changes to networks and models, with clearly stated limitations in terms of statistical confidence. For this special issue, we selected several papers with novel approaches that could improve the reproducibility of proteomics results. Topics covered include improvements in quality assessment of tandem mass spectrometry, combining search results to improve the reliability of the predicted identities, and steps to improve the reliability of MRM results. We hope these papers will serve as examples that provide insight into how proteomic analysis pipelines can be improved in the near future. Quality assessment of tandem mass spectrometry has been evaluated previously, but the methods have focused primarily on supervised machine-learning approaches. The limitations of these approaches are twofold: (1) they require manually validated data to develop a classifier, and (2) it is difficult to create a classifier robust enough to perform reasonably, given the inherent inconsistencies observed when tandem mass spectrometers are used under different conditions. Furthermore, the tandem mass

Clin Proteom (2009) 5:1–2 DOI 10.1007/s12014-009-9026-3
