Clustering of diverse genomic data using information fusion

Jyotsna Kasturi, Raj Acharya

doi:10.1093/bioinformatics/bti186

Abstract

Genome sequencing projects and high-throughput technologies such as DNA and protein arrays have produced very large amounts of information-rich data. Microarray experimental data are a valuable but limited source for inferring gene regulation mechanisms on a genomic scale. Additional information, such as promoter sequences of genes/DNA binding motifs, gene ontologies, and location data, can increase the statistical significance of the findings when combined with gene expression analysis. This paper introduces a machine learning approach to information fusion for combining heterogeneous genomic data. The algorithm uses an unsupervised joint learning mechanism that identifies clusters of genes using the combined data. The correlation between gene expression time-series patterns obtained from different experimental conditions and the presence of several distinct and repeated motifs in their upstream sequences is examined here using publicly available yeast cell-cycle data. The results show that the combined learning approach identifies correlated genes effectively. The algorithm provides an automated clustering method but allows the user to specify a priori, using probabilities, the influence of each data type on the final clustering. Software code is available by request from the first author (jkasturi@cse.psu.edu).
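
To illustrate the idea of letting user-chosen weights control how much each data type influences the clustering, here is a minimal Python sketch. It is not the authors' algorithm: it swaps in a plain weighted sum of two dissimilarities (correlation distance on expression profiles, Jaccard distance on motif sets) followed by average-linkage agglomerative clustering. The gene names, motif labels, profiles, and function names are all hypothetical toy values chosen for the example.

```python
from math import sqrt

def correlation_distance(x, y):
    """(1 - Pearson r) / 2, so the result lies in [0, 1]."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.5  # assumption: a flat profile counts as uncorrelated
    r = max(-1.0, min(1.0, cov / (sx * sy)))  # clamp rounding error
    return (1.0 - r) / 2.0

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of motif labels."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)

def fused_distance(g1, g2, weights):
    """Weighted sum of the two per-data-type distances."""
    w_expr, w_motif = weights
    return (w_expr * correlation_distance(g1["expr"], g2["expr"])
            + w_motif * jaccard_distance(g1["motifs"], g2["motifs"]))

def cluster_genes(genes, weights, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    clusters = [[name] for name in genes]

    def linkage(c1, c2):
        total = sum(fused_distance(genes[a], genes[b], weights)
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical toy data: expression groups g1 with g2, motifs group g1 with g3.
GENES = {
    "g1": {"expr": [1, 2, 3, 4], "motifs": {"motif_1"}},
    "g2": {"expr": [2, 4, 6, 8], "motifs": {"motif_2"}},
    "g3": {"expr": [4, 3, 2, 1], "motifs": {"motif_1"}},
    "g4": {"expr": [8, 6, 4, 2], "motifs": {"motif_2"}},
}

if __name__ == "__main__":
    print(cluster_genes(GENES, (1.0, 0.0), 2))  # expression only
    print(cluster_genes(GENES, (0.0, 1.0), 2))  # motifs only
```

On this toy data, putting all the weight on expression groups g1 with g2 and g3 with g4, while putting all the weight on motifs groups g1 with g3 and g2 with g4. Intermediate weights trade the two sources off against each other, which is the kind of user control the abstract describes, although the paper's own joint learning mechanism may differ substantially from this sketch.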

Journal: Bioinformatics
Publication Date: Dec 17, 2004
Citations: 29
