Abstract

Cancer is a complex disease that cannot be diagnosed reliably from single-gene expression analysis alone. Gene-set analysis of high-throughput gene expression profiles, which are controlled by various environmental factors, is a technique commonly adopted by the cancer research community. This work develops a comprehensive gene expression analysis tool, the gene-set activity toolbox (GAT), which integrates a data retriever, traditional data pre-processing, several gene-set analysis methods, network visualization, and data mining tools. The gene-set analysis methods identify subsets of phenotype-relevant genes, which are then used to build a classification model. To evaluate GAT's performance, we performed a cross-dataset validation study on three common cancers: colorectal, breast, and lung cancer. The results show that GAT can be used to build a reasonable disease diagnostic model and that the predicted markers have biological relevance. GAT is available at http://gat.sit.kmutt.ac.th, where GAT's Java library for gene-set analysis, simple classification, and a database with three cancer benchmark datasets can be downloaded.

Highlights

  • Despite significant advances in early cancer detection, treatment, and prevention, we still observe high incidence and mortality rates for all types of cancers,[1] with even greater risks as we get older.[2]

  • The gene-sets identified by the gene-network-based feature set (GNFS), PFSNet, and ANOVA-based feature set (AFS) methods with defined parameters are compared

  • This study demonstrates a gene-set-based microarray data analysis tool using the newly developed gene-set activity toolbox (GAT) on six microarray datasets

Summary

Introduction

Despite significant advances in early cancer detection, treatment, and prevention, we still observe high incidence and mortality rates for all types of cancers,[1] with even greater risks as we get older.[2] With the advent of an aging society, accurate cancer diagnosis is needed to detect cancer conditions early and effectively and to reduce the underlying mortality rate.[3,4,5] Large numbers of microarray datasets have continually been generated and deposited into public databases. The performance of the CORGs-based method was evaluated with many cancer datasets. This transformed activity was shown to be robust and provided more discriminative power than the use of gene expression levels. Subsequent works focus on using a subset of phenotype-correlated genes (PCOGs) to uniquely identify pathways related to breast

