Abstract

Background: It has been a long-standing biological challenge to understand the molecular regulatory mechanisms behind mammalian ageing. Harnessing the availability of many ageing microarray datasets, a number of studies have shown that it is possible to identify genes that have age-dependent differential expression (DE) or differential variability (DV) patterns. The majority of the studies identify "interesting" genes using a linear regression approach, which is known to perform poorly in the presence of outliers or if the underlying age-dependent pattern is non-linear. Clearly a more robust and flexible approach is needed to identify genes with various age-dependent gene expression patterns.

Results: Here we present a novel model selection approach to discover genes with linear or non-linear age-dependent gene expression patterns from microarray data. To identify DE genes, our method fits three quantile regression models (constant, linear and piecewise linear models) to the expression profile of each gene, and selects the least complex model that best fits the available data. Similarly, DV genes are identified by fitting and comparing two quantile regression models (non-DV and the DV models) to the expression profile of each gene. We show that our approach is much more robust than the standard linear regression approach in discovering age-dependent patterns. We also applied our approach to analyze two human brain ageing datasets and found many biologically interesting gene expression patterns, including some very interesting DV patterns, that have been overlooked in the original studies. Furthermore, we propose that our model selection approach can be extended to discover DE and DV genes from microarray datasets with discrete class labels, by considering different quantile regression models.

Conclusion: In this paper, we present a novel application of quantile regression models to identify genes that have interesting linear or non-linear age-dependent expression patterns. One important contribution of this paper is to introduce a model selection approach to DE and DV gene identification, which is most commonly tackled by null hypothesis testing approaches. We show that our approach is robust in analyzing real and simulated datasets. We believe that our approach is applicable in many ageing or time-series data analysis tasks.

Highlights

  • Introduction to quantile regression: The standard linear regression approach aims to estimate a conditional mean function of y = f(x) given any x

  • In this paper, we present a novel application of quantile regression models to identify genes that have interesting linear or non-linear age-dependent expression patterns

  • One important contribution of this paper is to introduce a model selection approach to differential expression (DE) and differential variability (DV) gene identification, which is most commonly tackled by null hypothesis testing approaches


Summary

Introduction

Quantile regression has recently been applied to various areas of bioinformatics, such as visualization of array Comparative Genomic Hybridization (CGH) data [19,20], identification of differentially expressed genes in two-color microarray datasets [21] and outlier detection in mass spectrometry data [22].

It has been a long-standing biological challenge to understand the molecular regulatory mechanisms behind mammalian ageing.

Given the expression profile of a gene in the form of $\{(x_i, y_i)\}_{i=1}^{n}$, the parameter vector $\theta$ can be estimated by the method of ordinary least squares, which can be written as the following minimization problem:

$$\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{n} \left( y_i - f(x_i; \theta) \right)^2$$
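As a rough illustration of why a quantile-based fit can be more robust to outliers than least squares, the sketch below fits only a constant model to a handful of made-up expression values. It is not the authors' implementation: the data, the function names and the grid search are all assumptions chosen for illustration, and the check loss used here is the standard textbook form, rho_tau(u) = u * (tau - 1[u < 0]), rather than a formula quoted from this excerpt.

```python
from statistics import mean, median


def squared_loss(theta, ys):
    """Ordinary least squares objective for a constant model y = theta."""
    return sum((y - theta) ** 2 for y in ys)


def check_loss(theta, ys, tau=0.5):
    """Quantile regression objective (check loss) for a constant model.

    Each residual u = y - theta is weighted by tau when u >= 0 and by
    (1 - tau) when u < 0; tau = 0.5 corresponds to median regression.
    """
    total = 0.0
    for y in ys:
        u = y - theta
        total += u * (tau - (1.0 if u < 0 else 0.0))
    return total


def argmin_on_grid(loss, ys, grid):
    """Return the grid value with the smallest loss (brute-force search)."""
    return min(grid, key=lambda theta: loss(theta, ys))


# Hypothetical expression values for one gene; the last value is an outlier.
ys = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 9.0]

# Candidate constants from 0.00 to 10.00 in steps of 0.01.
grid = [i / 100 for i in range(0, 1001)]

ols_fit = argmin_on_grid(squared_loss, ys, grid)
median_fit = argmin_on_grid(check_loss, ys, grid)

print("least squares fit:", ols_fit, "(sample mean:", mean(ys), ")")
print("median regression fit:", median_fit, "(sample median:", median(ys), ")")
```

In this toy setting the least squares estimate follows the sample mean and is pulled towards the outlier, while the tau = 0.5 check-loss estimate follows the sample median. A practical analysis would fit the linear and piecewise linear models with a dedicated quantile regression solver rather than a grid search.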
