Abstract
Data processing in proteomics can be a challenging endeavor, requiring extensive knowledge of many different software packages, all with different algorithms, data format requirements, and user interfaces. In this article we describe the integration of a number of existing programs and tools in Taverna Workbench, a scientific workflow manager currently being developed in the bioinformatics community. We demonstrate how a workflow manager provides a single, visually clear and intuitive interface to complex data analysis tasks in proteomics, from raw mass spectrometry data to protein identifications and beyond.
Highlights
Data analysis in mass spectrometry and proteomics is inherently a multistep process, typically involving several of the following steps: m/z calibration, chromatographic time alignment, intensity normalization, compound identification by database search or de novo interpretation, quantitation, addressing the protein inference problem, merging and/or comparing data from different samples or time points, multivariate statistics, and mapping of data to Gene Ontology annotations or biological networks
Inputs are processed in consecutive steps by processing units, whereby data flows from one processing unit to another until all processing units have finished their task and the output is finalized
The second workflow aligns and combines an LC-MS/MS data set with an accurate mass
Summary
Data processing in proteomics can be a challenging endeavor, requiring extensive knowledge of many different software packages, all with different algorithms, data format requirements, and user interfaces. Scientific workflow managers have become popular in bioinformatics because they are well suited to assembling specialized software modules or scripts into an overall data flow, typically a directed acyclic graph, that takes the data through consecutive steps of analysis. Workflow managers such as Kepler [1, 2] and Taverna Workbench [3, 4] provide visualization and a graphical user interface for designing and executing analytical process flows. We will describe a number of ways in which existing and novel proteomics analysis scenarios can be implemented in a scientific workflow manager.
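The idea of a data flow organized as a directed acyclic graph can be illustrated with a minimal sketch. This is not how Taverna or Kepler are implemented; it only shows the general principle of running processing units in dependency order. The function `run_workflow`, the step names (`calibrate`, `normalize`, `report`), and the toy peak list are all hypothetical and chosen purely for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, dependencies, inputs):
    """Run each processing unit once all of its upstream units have finished.

    steps:        name -> function taking a dict of upstream results
    dependencies: name -> set of upstream names it consumes
    inputs:       name -> initial data for the source nodes
    """
    results = dict(inputs)
    # static_order() yields every node after all of its predecessors
    # and raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(dependencies).static_order():
        if name in results:  # source node, data already provided
            continue
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = steps[name](upstream)
    return results


# Toy processing units (illustrative assumptions, not real algorithms).
def calibrate(up):
    # Shift every m/z value by a fixed, made-up offset.
    return [(mz + 0.5, inten) for mz, inten in up["raw"]]


def normalize(up):
    # Scale intensities so that the most intense peak equals 1.0.
    peaks = up["calibrate"]
    top = max(inten for _, inten in peaks)
    return [(mz, inten / top) for mz, inten in peaks]


def report(up):
    # Summarize the normalized peak list.
    peaks = up["normalize"]
    return {"n_peaks": len(peaks),
            "base_peak_mz": max(peaks, key=lambda p: p[1])[0]}


steps = {"calibrate": calibrate, "normalize": normalize, "report": report}
dependencies = {
    "calibrate": {"raw"},
    "normalize": {"calibrate"},
    "report": {"normalize"},
}
inputs = {"raw": [(100.0, 5.0), (200.0, 10.0)]}  # (m/z, intensity) pairs

results = run_workflow(steps, dependencies, inputs)
print(results["report"])
```

Because each unit only sees the outputs of the units it depends on, a step can be swapped for a different tool without touching the rest of the graph, which is the property that makes workflow managers attractive for multistep analyses.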