Interpolation based consensus clustering for gene expression time series.

Tai-Yu Chiu, Ting-Chieh Hsu, Jia-Shung Wang, Chia-Cheng Yen

doi:10.1186/s12859-015-0541-0

Open Access

https://doi.org/10.1186/s12859-015-0541-0

Journal: BMC Bioinformatics	Publication Date: Apr 16, 2015
Citations: 47	License type: CC BY 4.0

Affiliation: National Tsing Hua University

Abstract

Background: Unsupervised analyses such as clustering are the essential tools required to interpret time-series expression data from microarrays. Several clustering algorithms have been developed to analyze gene expression data. Early methods such as k-means, hierarchical clustering, and self-organizing maps are popular for their simplicity. However, because of noise and uncertainty of measurement, these common algorithms have low accuracy. Moreover, because gene expression is a temporal process, the relationship between successive time points should be considered in the analyses. In addition, biological processes are generally continuous; therefore, the datasets collected from time series experiments are often found to have an insufficient number of data points and, as a result, compensation for missing data can also be an issue.

Results: An affinity propagation-based clustering algorithm for time-series gene expression data is proposed. The algorithm explores the relationship between genes using a sliding-window mechanism to extract a large number of features. In addition, the time-course datasets are resampled with spline interpolation to predict the unobserved values. Finally, a consensus process is applied to enhance the robustness of the method. Some real gene expression datasets were analyzed to demonstrate the accuracy and efficiency of the algorithm.

Conclusion: The proposed algorithm has benefitted from the use of cubic B-splines interpolation, sliding-window, affinity propagation, gene relativity graph, and a consensus process, and, as a result, provides both appropriate and effective clustering of time-series gene expression data. The proposed method was tested with gene expression data from the Yeast galactose dataset, the Yeast cell-cycle dataset (Y5), and the Yeast sporulation dataset, and the results illustrated the relationships between the expressed genes, which may give some insights into the biological processes involved.
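Two of the ingredients named in the abstract, the sliding-window mechanism and the consensus process, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `sliding_windows` and `co_association`, the window width, and all data values are hypothetical. The consensus step is shown as a co-association fraction, which is a common way to combine several clusterings and is only assumed here to resemble the paper's process.

```python
from itertools import combinations


def sliding_windows(series, width):
    """Return every contiguous window of `width` consecutive time points."""
    if width < 1 or width > len(series):
        raise ValueError("width must be between 1 and len(series)")
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def co_association(labelings):
    """For each pair of items, the fraction of clusterings that group them together."""
    n = len(labelings[0])
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for labels in labelings:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / len(labelings) for pair, c in counts.items()}


# Toy expression profile with 5 time points (made-up values).
profile = [0.1, 0.4, 0.9, 0.7, 0.2]
windows = sliding_windows(profile, width=3)

# Three hypothetical clusterings of 4 genes, given as one label per gene.
runs = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
agreement = co_association(runs)
```

Pairs with a high agreement value are grouped together by most runs, so a final clustering built from these values is less sensitive to any single noisy run.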

Highlights

Unsupervised analyses such as clustering are the essential tools required to interpret time-series expression data from microarrays
Here, the proposed algorithm based on B-splines interpolation [10], affinity propagation [12], and consensus clustering [14] is described
The time-course gene expression clustering problem was formulated as follows: for a set of genes G = {G_1, G_2, ..., G_n}, where n is the number of genes and each gene G_i includes τ time points for the gene expression values, the n genes are grouped into K disjoint clusters C_1, C_2, ..., C_K
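The formulation in the last highlight can be made concrete with a small Python sketch. It assumes, beyond what the text states, that every gene belongs to exactly one cluster and that no cluster is empty. The helper name `is_valid_partition` and the gene indices are hypothetical.

```python
def is_valid_partition(n_genes, clusters):
    """True if the clusters are non-empty, pairwise disjoint, and cover genes 0..n_genes-1."""
    seen = set()
    for cluster in clusters:
        members = set(cluster)
        # Reject empty clusters and any gene already assigned elsewhere.
        if not members or seen & members:
            return False
        seen |= members
    return seen == set(range(n_genes))


# n = 5 genes grouped into K = 3 disjoint clusters (gene indices are illustrative).
good = [[0, 1], [2], [3, 4]]
overlapping = [[0, 1], [1, 2], [3, 4]]  # gene 1 appears in two clusters
incomplete = [[0, 1], [3, 4]]           # gene 2 is not assigned

results = [is_valid_partition(5, c) for c in (good, overlapping, incomplete)]
```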

Summary

Introduction

Unsupervised analyses such as clustering are the essential tools required to interpret time-series expression data from microarrays. Several clustering algorithms have been developed to analyze gene expression data. Methods such as k-means, hierarchical clustering, and self-organizing maps are popular for their simplicity. High-throughput data of time-series gene expression are recorded to explore the complex dynamics of biological systems. Analyses of microarray data are essential in several time-series expression experiments such as biological systems, infectious diseases, and genetic interactions [1]. Pattern recognition techniques are helpful [2] to explore and exploit high-throughput screening data from microarrays. By using these techniques, similar expression patterns can be organized into a group. Some of the older methods such as k-means, hierarchical clustering, and

Similar Papers

Improved Affinity Propagation by Spline Interpolation on Time-Series Gene Expression Clustering
01 Jan 2009

An ensemble learning approach to reverse-engineering transcriptional regulatory networks from time-series gene expression data
Jianhua Ruan ... Edward J Perkins
BMC Genomics | VOL. 10
01 Jan 2009

New algorithms for inferring gene regulatory networks from time-series expression data on Apache Spark
Jason T.L Wang ... Yasser Abduallah
International Journal of Big Data Intelligence | VOL. 6
01 Jan 2019

Constrained Fourier estimation of short-term time-series gene expression data reduces noise and improves clustering and gene regulatory network predictions
Nadav Bar ... Naresh Doni Jayavelu
BMC Bioinformatics | VOL. 23
09 Aug 2022
