Abstract

Background: The genomics and microarray technology played tremendous roles in the amount of biologically useful information on gene expression of thousands of genes to be simultaneously observed. This required various computational methods of analyzing these amounts of data in order to discover information about gene function and regulatory mechanisms. Methods: In this research, we investigated the usefulness of hidden markov models (HMM) as a method of clustering Plasmodium falciparum genes that show similar expression patterns. The Baum-Welch algorithm was used to train the dataset to determine the maximum likelihood estimate of the Model parameters. Cluster validation was conducted by performing a likelihood ratio test. Results: The fitted HMM was able to identify 3 clusters from the dataset and sixteen functional enrichment in the cluster set were found. This method efficiently clustered the genes based on their expression pattern while identifying erythrocyte membrane protein 1 as a prominent and diverse protein in P. falciparum. Conclusion: The ability of HMM to identify 3 clusters with sixteen functional enrichment from the 2000 genes makes this a useful method in functional cluster analysis for P. falciparum

Highlights

  • Technological advancement in bioinformatics such as the high through put sequencing technology has resulted in the availability of a very large amount of informative data[1]

  • Microarray consists of many thousands of short, single stranded sequences, each immobilized as individual elements on a solid support, that are complementary to the cDNA strand representing a single gene

  • The clustering algorithms was implemented by first randomly initializing all the hidden markov models (HMM) parameters, forward algorithm was implemented to calculate the forward probabilities of the observation and Baum-Welch algorithm was used for data fitting, the likelihood of each HMM was calculated iteratively until the optimal likelihood is obtained

Read more

Summary

Introduction

Technological advancement in bioinformatics such as the high through put sequencing technology has resulted in the availability of a very large amount of informative data[1]. The genomics and microarray technology played tremendous roles in the amount of biologically useful information on gene expression of thousands of genes to be simultaneously observed. This required various computational methods of analyzing these amounts of data in order to discover information about gene function and regulatory mechanisms. Results: The fitted HMM was able to identify 3 clusters from the dataset and sixteen functional enrichment in the cluster set were found This method efficiently clustered the genes based on their expression pattern while identifying erythrocyte membrane protein 1 as a prominent and diverse protein in P. falciparum. Conclusion: The ability of HMM to identify 3 clusters with sixteen functional enrichment from the 2000 genes makes this a useful method in functional cluster analysis for P. falciparum

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.