Abstract

The first stable version of the Proteomics Standards Initiative mzIdentML open data standard (version 1.1) was published in 2012—capturing the outputs of peptide and protein identification software. In the intervening years, the standard has become well-supported in both commercial and open software, as well as a submission and download format for public repositories. Here we report a new release of mzIdentML (version 1.2) that is required to keep pace with emerging practice in proteome informatics. New features have been added to support: (1) scores associated with localization of modifications on peptides; (2) statistics performed at the level of peptides; (3) identification of cross-linked peptides; and (4) support for proteogenomics approaches. In addition, there is now improved support for the encoding of de novo sequencing of peptides, spectral library searches, and protein inference. As a key point, the underlying XML schema has only undergone very minor modifications to simplify as much as possible the transition from version 1.1 to version 1.2 for implementers, but there have been several notable updates to the format specification, implementation guidelines, controlled vocabularies and validation software. mzIdentML 1.2 can be described as backwards compatible, in that reading software designed for mzIdentML 1.1 should function in most cases without adaptation. We anticipate that these developments will provide a continued stable base for software teams working to implement the standard. All the related documentation is accessible at http://www.psidev.info/mzidentml.

Highlights

  • The Proteomics Standards Initiative (PSI) has taken the role of developing standard file formats for different aspects of mass spectrometry (MS) based analysis (for a review see [1])

  • A valid mzIdentML file must be both syntactically and semantically correct, and checks for both have been implemented in validation software [9]

  • This mandatory requirement is met by adding an additional controlled vocabulary (CV) term in the <SpectrumIdentificationProtocol> element depending on the type of workflow represented (Table II)
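The two kinds of check mentioned above can be illustrated with a minimal Python sketch. It is not the official validator: it only confirms that a document parses as XML (a very weak stand-in for schema validation) and then looks for a given CV accession inside each SpectrumIdentificationProtocol element. The accession `MS:0000000`, the sample document, and the function names are hypothetical placeholders; the real terms are those listed in Table II of the paper.

```python
import xml.etree.ElementTree as ET

# Illustrative document only. The accession MS:0000000 is a made-up
# placeholder, not a real PSI-MS term, and the namespace is assumed.
SAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.2">
  <AnalysisProtocolCollection>
    <SpectrumIdentificationProtocol id="SIP_1">
      <AdditionalSearchParams>
        <cvParam cvRef="PSI-MS" accession="MS:0000000" name="hypothetical workflow term"/>
      </AdditionalSearchParams>
    </SpectrumIdentificationProtocol>
  </AnalysisProtocolCollection>
</MzIdentML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def protocol_cv_accessions(xml_text):
    """Map each SpectrumIdentificationProtocol id to its cvParam accessions.

    ET.fromstring raises ParseError if the text is not well-formed XML,
    which serves here as a crude syntactic check.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for elem in root.iter():
        if local_name(elem.tag) == "SpectrumIdentificationProtocol":
            found[elem.get("id")] = [
                child.get("accession")
                for child in elem.iter()
                if local_name(child.tag) == "cvParam"
            ]
    return found


def has_required_term(xml_text, accession):
    """True if every protocol element carries the given CV accession."""
    protocols = protocol_cv_accessions(xml_text)
    return bool(protocols) and all(accession in accs for accs in protocols.values())
```

Matching on local tag names keeps the sketch independent of the exact namespace URI, so the same function can be pointed at version 1.1 or 1.2 files. A production check should rely on the PSI validation software instead.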


Summary

Technological Innovation and Resources

The mzIdentML Data Standard Version 1.2, Supporting Advances in Proteome Informatics

The first stable version of the Proteomics Standards Initiative mzIdentML open data standard (version 1.1) was published in 2012, capturing the outputs of peptide and protein identification software. The Proteomics Standards Initiative (PSI) has taken the role of developing standard file formats for different aspects of mass spectrometry (MS) based analysis (for a review see [1]). These include the mzML format, which can store raw MS data suitable for quantitation processes, as well as processed peak lists for searching [2]. Visualization software for mzIdentML files is available, most notably the open source PRIDE Inspector tool [22], which was updated in 2016 to fully support mzIdentML, and the ProteoIDViewer [9]. Some of these tools reuse open source libraries tailored to the format, such as jmzIdentML [23], mzid Library [9], and the ms-data-core-api [24]. Toolbox for protein inference and identification analysis; it supports mzIdentML 1.1.
