Abstract
Background
Developing robust and efficient methods for analyzing and interpreting high-dimensional gene expression profiles continues to be a focus in computational biology. Accumulated experimental evidence supports the assumption that genes are expressed and perform their functions in cells in a modular fashion. There is therefore room for timely computational algorithms that use robust functional expression profiles to classify complex human diseases precisely at the modular level.
Results
Inspired by the insight that genes act as a module to carry out a highly integrated cellular function, we define a low-dimensional functional expression profile for data reduction. After annotating each gene to the functional categories defined in a suitable gene function classification system (Gene Ontology in this study), we identify the functional categories enriched with differentially expressed genes. For each functional category, or functional module, we compute one or more summary measures of the raw expression values of the annotated genes to capture the overall activity level of the module. In this way, the gene expressions within a functional module are treated as a single integrative data point that replaces the multiple values of individual genes. We compare the classification performance of decision trees based on functional expression profiles with that of decision trees based on conventional gene expression profiles, using four publicly available datasets. The comparison indicates that precise classification of tumour types and improved interpretation can be achieved with the reduced functional expression profiles.
Conclusion
This modular approach is shown to be a powerful alternative for analyzing high-dimensional microarray data and is robust to the high measurement noise and intrinsic biological variance inherent in such data. Furthermore, efficient integration with current biological knowledge facilitates interpretation of the underlying molecular mechanisms of complex human diseases at the modular level.
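The module-level summary step described in the Results can be illustrated with a minimal sketch. It assumes the summary measure is the mean or the median of the annotated genes' expression values, since the abstract does not name the measures used. The function name, gene names, and module names are hypothetical and chosen only for illustration.

```python
from statistics import mean, median


def functional_expression_profile(expression, modules, summary=mean):
    """Collapse one sample's gene-level profile into one value per module.

    expression: dict mapping gene name -> expression value (one sample)
    modules:    dict mapping module name -> list of annotated gene names
    summary:    function reducing a list of values to a single number
                (mean or median here; an assumption, not the paper's stated choice)
    """
    profile = {}
    for module, genes in modules.items():
        # Keep only the annotated genes that were actually measured.
        values = [expression[g] for g in genes if g in expression]
        if values:  # skip modules with no measured genes
            profile[module] = summary(values)
    return profile


# Hypothetical toy data: five genes annotated to two functional modules.
sample = {"geneA": 2.0, "geneB": 4.0, "geneC": 9.0, "geneD": 1.0, "geneE": 3.0}
modules = {
    "module_1": ["geneA", "geneB", "geneC"],
    "module_2": ["geneD", "geneE"],
}

fep_mean = functional_expression_profile(sample, modules, mean)
fep_median = functional_expression_profile(sample, modules, median)
print(fep_mean)
print(fep_median)
```

In this toy case five gene-level features shrink to two module-level features. The resulting profiles could then be passed to any classifier, such as the decision trees used in the study.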
Highlights
Developing robust and efficient methods for analyzing and interpreting high-dimensional gene expression profiles continues to be a focus in computational biology
With the rapid accumulation of gene functional knowledge, Gene Ontology (GO) functional modules have been widely applied to infer the unknown functions of genes from their expression profiles (e.g. [33,34,35]), but there is still room for timely computational algorithms that use robust functional expression profiles to classify complex human diseases precisely at the modular level
We have proposed an alternative approach to analyzing gene expression profiles at the modular level, in which functional expression profiles replace the traditional gene expression profiles
Summary
Developing robust and efficient methods for analyzing and interpreting high-dimensional gene expression profiles continues to be a focus in computational biology. Inspired by the insight that genes often interact as a module to realize a highly integrated cellular function, we propose an alternative approach to analyzing high-dimensional microarray data that formulates the disease classification problem from the perspective of modularity. Instead of analyzing the raw expression values of single genes, we treat the gene expressions within a functional module as a single integrative data point, which shrinks the feature dimension. This modular approach is flexible and statistically robust to the high measurement noise and intrinsic biological variance inherent in microarray data. To obtain a robust and convincing comparison of functional expression profiles (FEP) and gene expression profiles (GEP), we analyzed two additional large-scale datasets and describe the detailed results in the supplement [see Additional file 1].
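The robustness claim can be made concrete with a small simulation. This is only a sketch under simplifying assumptions that are not taken from the paper: all genes in a module share one true activity level, the measurement noise is independent and Gaussian, and the module is summarized by its mean. All constants below are hypothetical.

```python
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is reproducible

TRUE_ACTIVITY = 5.0    # assumed shared activity level of the module
NOISE_SD = 1.0         # assumed standard deviation of per-gene noise
GENES_PER_MODULE = 20  # assumed module size
N_SAMPLES = 2000       # number of simulated samples

single_gene = []  # one noisy gene per sample
module_mean = []  # mean over all genes of the module per sample
for _ in range(N_SAMPLES):
    genes = [random.gauss(TRUE_ACTIVITY, NOISE_SD) for _ in range(GENES_PER_MODULE)]
    single_gene.append(genes[0])
    module_mean.append(mean(genes))

# Spread of a single gene versus spread of the module-level summary.
var_gene = pvariance(single_gene)
var_module = pvariance(module_mean)
print(var_gene, var_module)
```

Under these assumptions the variance of the module mean is about the per-gene variance divided by the number of genes in the module, which is why a module-level feature fluctuates less than any single gene. Real expression data has correlated genes and non-Gaussian noise, so the reduction in practice will be smaller.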