Abstract
The PRoteomics IDEntifications (PRIDE) database is a large public proteomics data repository, containing over 270 million mass spectra (as of November 2011). PRIDE is an archival database, providing the proteomics data supporting specific scientific publications in a computationally accessible manner. While PRIDE faces rapid increases in both the size and the number of data depositions, the major challenge is to ensure high-quality data depositions in the context of highly diverse proteomics workflows and data representations. Here, we describe the PRIDE curation pipeline and its practical application in quality control of complex data depositions. Database URL: http://www.ebi.ac.uk/pride/.
Highlights
Proteomics can be defined as ‘the study of the subsets of proteins present in different parts of the organism and how they change with time and varying conditions’ (1).
The situation is already improving significantly as a result of the Human Proteome Organization Proteomics Standards Initiative (PSI) developing the standard formats mzML (2) and mzIdentML (3), which are increasingly being implemented by instrument and search engine producers.
While the generation and public availability of proteomics data are still several orders of magnitude smaller than those of, e.g., genomics data, both the quantity and complexity of proteomics data sets deposited in the PRoteomics IDEntifications (PRIDE) database are rapidly increasing.
Summary
Attila Csordas*, David Ovelleiro, Rui Wang, Joseph M.
The PRoteomics IDEntifications (PRIDE) database is a large public proteomics data repository, containing over 270 million mass spectra (as of November 2011). PRIDE is an archival database, providing the proteomics data supporting specific scientific publications in a computationally accessible manner. While PRIDE faces rapid increases in both the size and the number of data depositions, the major challenge is to ensure high-quality data depositions in the context of highly diverse proteomics workflows and data representations. We describe the PRIDE curation pipeline and its practical application in quality control of complex data depositions.