Abstract
Background: Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution. The rapid development of high-throughput technologies for measuring protein interactions calls for extending this fundamental operation to the level of pathways in protein networks.
Results: We present a comprehensive framework for protein network searches using pathway queries. Given a linear query pathway and a network of interest, our algorithm, QPath, efficiently searches the network for homologous pathways, allowing both insertions and deletions of proteins in the identified pathways. Matched pathways are automatically scored according to their variation from the query pathway in terms of the protein insertions and deletions they employ, the sequence similarity of their constituent proteins to the query proteins, and the reliability of their constituent interactions. We applied QPath to systematically infer protein pathways in fly using an extensive collection of 271 putative pathways from yeast. QPath identified 69 conserved pathways whose members were both functionally enriched and coherently expressed. The resulting pathways tended to preserve the function of the original query pathways, allowing us to derive a first annotated map of conserved protein pathways in fly.
Conclusion: Pathway homology searches using QPath provide a powerful approach for identifying biologically significant pathways and inferring their function. The growing amounts of protein interactions in public databases underscore the importance of our network querying framework for mining protein network data.
Highlights
Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution
Figure 1: The QPath algorithmic flow. (a) Given a query pathway, a weighted protein-protein interaction (PPI) network, and sequence similarity scores between the query proteins and the network proteins, the QPath algorithm identifies a set of matching pathways
We applied QPath to analyze the PPI networks of the yeast S. cerevisiae, the fly D. melanogaster, and human, aiming to address two coupled, fundamental questions motivated by sequence analysis: (i) Can pathway homology be used to identify functionally significant pathways? (ii) Can one infer the function of a pathway based on pathway homology information? We provide positive answers to both questions
Summary
Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution. Sequence homology searches have been the workhorse of bioinformatics for the past 30 years, providing the means to study the function and evolution of genes and proteins. Studying the function and evolution of protein modules underscores the importance of extending homology search tools from the single gene level to the network level. In the QPath algorithmic flow (Figure 1): (a) Given a query pathway, a weighted PPI network, and sequence similarity scores between the query proteins and the network proteins, the QPath algorithm identifies a set of matching pathways. These are scored to capture the tendency of their constituent proteins to have a coherent function. (b) An example of an alignment that induces protein insertions (F') and deletions (C).
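To make the inputs and outputs described above concrete, here is a minimal Python sketch of a pathway query that tolerates insertions and deletions. It is not the QPath implementation: it uses plain exhaustive search, which is only practical on toy networks, and the function name `find_best_pathway`, the additive score, the penalty values, and the example data are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Alignment = List[Tuple[Optional[str], Optional[str]]]


def find_best_pathway(
    query: List[str],
    edge_weight: Dict[Tuple[str, str], float],
    similarity: Dict[Tuple[str, str], float],
    max_insertions: int = 1,
    max_deletions: int = 1,
    insertion_penalty: float = -1.0,  # assumed value, not from the paper
    deletion_penalty: float = -1.0,   # assumed value, not from the paper
) -> Tuple[float, Alignment]:
    """Exhaustively search for the best-scoring simple path aligned to `query`.

    The score adds up interaction weights, sequence similarities, and
    insertion/deletion penalties. Returns (score, alignment), where each
    alignment entry is (query protein, network protein); None on the left
    marks an insertion and None on the right marks a deletion.
    """
    # Undirected adjacency map: protein -> {neighbor: interaction weight}.
    neighbors: Dict[str, Dict[str, float]] = {}
    for (u, v), w in edge_weight.items():
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w

    best_score = float("-inf")
    best_alignment: Alignment = []

    def extend(i, last, visited, n_ins, n_del, score, alignment):
        nonlocal best_score, best_alignment
        if i == len(query):
            # Every query protein was matched or deleted.
            if last is not None and score > best_score:
                best_score, best_alignment = score, list(alignment)
            return
        q = query[i]
        # The first matched protein may be any vertex; later ones must be
        # neighbors of the current end of the path.
        candidates = neighbors[last] if last is not None else dict.fromkeys(neighbors, 0.0)
        for v, w in candidates.items():
            if v in visited:
                continue  # keep the matched path simple
            s = similarity.get((q, v))
            if s is not None:  # match q to v
                alignment.append((q, v))
                extend(i + 1, v, visited | {v}, n_ins, n_del, score + w + s, alignment)
                alignment.pop()
            if last is not None and n_ins < max_insertions:  # insert v
                alignment.append((None, v))
                extend(i, v, visited | {v}, n_ins + 1, n_del,
                       score + w + insertion_penalty, alignment)
                alignment.pop()
        if n_del < max_deletions:  # delete q
            alignment.append((q, None))
            extend(i + 1, last, visited, n_ins, n_del + 1,
                   score + deletion_penalty, alignment)
            alignment.pop()

    extend(0, None, frozenset(), 0, 0, 0.0, [])
    return best_score, best_alignment


# Toy data (made up): query A-B-C-D against the network path A'-B'-F'-D'.
query = ["A", "B", "C", "D"]
edges = {("A'", "B'"): 0.5, ("B'", "F'"): 0.5, ("F'", "D'"): 0.5}
sim = {("A", "A'"): 1.0, ("B", "B'"): 1.0, ("D", "D'"): 1.0}
score, alignment = find_best_pathway(query, edges, sim)
print(score, alignment)
```

In this toy case the only complete alignment deletes C from the query and inserts F' into the matched pathway, mirroring the kind of alignment described in panel (b). Searching a real PPI network would need a far more efficient strategy than this brute-force enumeration.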