Abstract

DNA microarray technology provides a powerful vehicle for exploring biological processes on a genomic scale. Machine-learning approaches such as association rule mining (ARM) have proven very effective in extracting biologically relevant associations among genes. Despite its usefulness, a standard ARM approach cannot model time relations among associated genes, even though temporal information is critical for understanding regulatory mechanisms in biological processes. Sequential rule mining (SRM) methods have been proposed for mining temporal relations in temporal data instead. Although successful, existing SRM applications to temporal microarray data have been designed exclusively for in vitro experiments in yeast, and no extension to in vivo data sets has been proposed to date. In contrast to in vitro experiments, microarray data derived from humans or animals raise “subject variability” as the main issue to address, so that databases include multiple sequences instead of a single one. A wide variety of SRM approaches could be used to handle these particularities. In this study, we propose an adaptation of the SRM method “CMRules” to extract sequential association rules from temporal gene expression data derived from humans. In addition to the data mining process, we further propose validating the extracted rules by integrating the results with external resources of biological knowledge (functional and pathway annotation databases). The data set employed consists of temporal gene expression data collected at three different time points during the course of a dietary intervention in 57 subjects with obesity (data set available under identifier GSE77962 in the Gene Expression Omnibus repository). The original clinical trial, published by Vink [1], investigated the effects on weight loss of two different dietary interventions (a low-calorie diet or a very low-calorie diet). In conclusion, the proposed method demonstrated a good ability to extract sequential association rules with further biological relevance within the context of obesity. Thus, the application of this method could be successfully extended to other longitudinal microarray data sets derived from humans.
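
To make the notion of a sequential rule over a multi-subject database concrete, the following is a minimal sketch, not the authors' implementation and not the CMRules algorithm itself. It only scores one candidate rule using the sequential support and confidence measures commonly used in SRM, assumed here to be the relevant ones. Each subject contributes one sequence with one itemset per time point; the item labels (`A_up`, `B_down`, `C_up`), the toy database, and the function names are hypothetical.

```python
from typing import List, Set, Tuple

Itemset = Set[str]
SubjectSequence = List[Itemset]  # one itemset per time point, in temporal order


def rule_occurs(seq: SubjectSequence, antecedent: Itemset, consequent: Itemset) -> bool:
    """True if all antecedent items appear up to some time point k and
    all consequent items appear strictly after k."""
    seen: Set[str] = set()
    for k in range(len(seq) - 1):
        seen |= seq[k]
        if antecedent <= seen:
            # Earliest k that completes the antecedent leaves the most room
            # for the consequent, so this single check is sufficient.
            later = set().union(*seq[k + 1:])
            return consequent <= later
    return False


def contains_all(seq: SubjectSequence, items: Itemset) -> bool:
    """True if every item appears somewhere in the subject's sequence."""
    return items <= set().union(*seq)


def seq_support_confidence(
    db: List[SubjectSequence], antecedent: Itemset, consequent: Itemset
) -> Tuple[float, float]:
    """Sequential support and confidence of antecedent => consequent."""
    n_rule = sum(rule_occurs(s, antecedent, consequent) for s in db)
    n_ante = sum(contains_all(s, antecedent) for s in db)
    support = n_rule / len(db)
    confidence = n_rule / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical toy database: four subjects, three time points each.
toy_db: List[SubjectSequence] = [
    [{"A_up"}, {"B_down"}, {"C_up"}],        # A_up precedes B_down: rule occurs
    [{"A_up", "B_down"}, set(), {"C_up"}],   # same time point: rule does not occur
    [{"B_down"}, {"C_up"}, set()],           # antecedent absent
    [{"C_up"}, {"A_up"}, {"B_down"}],        # A_up precedes B_down: rule occurs
]

support, confidence = seq_support_confidence(toy_db, {"A_up"}, {"B_down"})
print(f"{support:.2f} {confidence:.2f}")  # prints "0.50 0.67"
```

Treating each subject as a separate sequence is what lets the measures absorb subject variability: a rule is kept only if the temporal ordering recurs across enough subjects, rather than within one long series.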
