Biological pathway data integration trends, techniques, issues and challenges: A survey

Shubhalaxmi Kher, Neelam Rawat, Julie A Dickerson

doi:10.1109/nabic.2010.5716330

Abstract

One of the major challenges in modern bioinformatics research is integrating biological pathway data to understand the inner workings of the cell. Pathway data sources are often structured differently and employ their own algorithms for analysis and integration. Each has a specific motivation for integration that may suit only a particular type of pathway, such as metabolic pathways or protein-protein interactions. In addition to the documentation associated with biological pathway data sources, one needs to understand the database schemas used to store data in each source system and to translate among those schemas in order to exchange information between them. The authenticity of a data source may be subjective, because many sources are not independent but derived from others, and sources often contain similar or overlapping data elements while using conflicting data definitions. There is often a need for user-friendly tools and interfaces to transform bioinformatics data from one database schema to another and to discover correlated data across many databases, regardless of their structure. Most importantly, no standards have been established for developing biological pathway data sources and integrating them. Integration mechanisms may not record important metadata, such as copies of the input files and the time of integration, along with the integrated output file. This paper reviews recent developments in biological pathway and sequence data integration and discusses the trends, techniques, issues, and challenges.

Full Text