On the Impact of Granularity in Extracting Knowledge from Bioinformatics Data

Hesham Ali, Sean West

doi:10.5220/0005778700920103

Abstract

With the rapidly increasing amount and variety of biological data available to researchers, the focus of the biomedical research community has been shifting from pure data generation towards the development of new methodologies for data analytics. Although many researchers continue to focus on approaches developed for analyzing a single type of biological data, recent efforts have sought to exploit heterogeneous data sets that contain several types of data and to establish tools for data integration and analysis in many bioinformatics applications. Such efforts are expected to increase significantly in the coming decade. While this can be viewed as a positive step towards advancing big data analytics in bioinformatics, it is critical that these integration methodologies be studied meticulously to ensure the high quality of the knowledge extracted from the integrated data. In this work, we employ data integration methods to analyze biological data obtained from protein interaction networks and gene expression data. We conduct a study showing that problems can arise from integrating or fusing data obtained at different granularity levels, and we highlight the importance of developing advanced data fusion techniques to integrate various types of biological data for analytical purposes. Further, we explore the impact of granularity using a more formalized approach and show that granularity levels significantly impact the quality of the knowledge extracted from the integrated data.

Full Text