Abstract

Quality control is increasingly recognized as a crucial aspect of mass spectrometry-based proteomics. Several recent papers discuss relevant parameters for quality control and present applications to extract these from the instrumental raw data. What has been missing, however, is a standard data exchange format for reporting these performance metrics. We therefore developed the qcML format, an XML-based standard that follows the design principles of the related mzML, mzIdentML, mzQuantML, and TraML standards from the HUPO-PSI (Proteomics Standards Initiative). In addition to the XML format, we also provide tools for the calculation of a wide range of quality metrics as well as a database format and interconversion tools, so that existing LIMS can easily add relational storage of the quality control data to their existing schema. Here we describe the qcML specification, along with possible use cases and an illustrative example of the subsequent analysis possibilities. All information about qcML is available at http://code.google.com/p/qcml.

Highlights

  • Despite the sets of metrics and corresponding software now available, two issues still prevent quality control data from taking its central role in the annotation of proteomics results.

  • Storing and communicating this new type of information is currently not standardized, which limits the dissemination of quality control data along with experimental data. The data can be generated by software tools of different origins, with the content and definitions of the performance metrics varying for each tool. To unify both storage and communication of this quality control information, as well as its integration in existing workflows, we propose the qcML format.

  • The expressive file format and database structure defined by the qcML specification allow a wide range of possibilities in dealing with quality control data in a standardized way.

Summary

Introduction

Storing and communicating this new type of information is currently not standardized, which limits the dissemination of quality control data along with experimental data. The data can be generated by software tools of different origins, with the content and definitions of the performance metrics varying for each tool. Like qcML, the related HUPO-PSI standards (mzML, mzIdentML, mzQuantML, and TraML) are all based upon the eXtensible Markup Language (XML), which allows complex hierarchical data structures to be stored while maintaining human readability. All these formats allow the extensible inclusion of extensive metadata by using terms from centrally managed, structured controlled vocabularies (CVs), which can describe both experimental and programmatic environmental variables (20).
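
To make the idea of CV-annotated XML concrete, the following minimal Python sketch parses a small hierarchical document in which each metric refers to a controlled vocabulary term. It is an illustration only: the element names (`qualityReport`, `run`, `metric`), the attribute names, and the accessions (`QC:EXAMPLE1`, `QC:EXAMPLE2`) are hypothetical assumptions and do not reproduce the actual qcML schema or any real CV entry.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document. Element names, attribute names, and
# accessions are illustrative assumptions, NOT the actual qcML schema.
EXAMPLE_XML = """\
<qualityReport>
  <cvList>
    <cv id="QC" fullName="Hypothetical quality control vocabulary"/>
  </cvList>
  <run id="run_01">
    <metric cvRef="QC" accession="QC:EXAMPLE1" name="number of MS1 spectra" value="4821"/>
    <metric cvRef="QC" accession="QC:EXAMPLE2" name="number of MS2 spectra" value="17350"/>
  </run>
</qualityReport>
"""


def read_metrics(xml_text):
    """Return {run id: {accession: (name, value)}} for every run in the document."""
    root = ET.fromstring(xml_text)

    # Collect the vocabularies declared in the document, so that every
    # metric can be checked against a declared CV.
    known_cvs = {cv.get("id") for cv in root.iter("cv")}

    metrics = {}
    for run in root.iter("run"):
        run_metrics = {}
        for metric in run.iter("metric"):
            if metric.get("cvRef") not in known_cvs:
                raise ValueError(f"undeclared CV: {metric.get('cvRef')!r}")
            # Values stay as strings here; a real reader would use the CV
            # term definition to decide how to interpret each value.
            run_metrics[metric.get("accession")] = (
                metric.get("name"),
                metric.get("value"),
            )
        metrics[run.get("id")] = run_metrics
    return metrics


if __name__ == "__main__":
    for run_id, run_metrics in read_metrics(EXAMPLE_XML).items():
        for accession, (name, value) in run_metrics.items():
            print(run_id, accession, name, value)
```

Keying each metric by its accession rather than by its display name reflects the purpose of a controlled vocabulary: tools of different origins can label a metric differently while still referring to the same centrally defined term.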

