Abstract

Gene set analysis (GSA) combines biological and statistical knowledge to identify gene sets that are differentially expressed between two or more phenotypes. In this paper, gene sets differentially expressed between acute lymphoblastic leukaemia (ALL) with BCR-ABL and ALL with no observed cytogenetic abnormalities were determined by GSA methods. BCR-ABL is an abnormal gene found in some people with ALL. The results of the two GSAs showed that the Category test identified 30 gene sets differentially expressed between the two phenotypes, while Hotelling's T2 identified only 19. Assessment of the genes common to the significant gene sets showed high agreement between the GSA results and the findings of biologists. In addition, the performance of these methods was compared on simulated and ALL data. On simulated data, as the correlation between gene pairs increased, the type I error rate decreased and the power increased for the multivariate (Hotelling's T2) test, in contrast to the univariate (Category) test.

Highlights

  • Microarray technology allows researchers to measure the expression of thousands of genes simultaneously, which has made it a tool for identifying genes that are differentially expressed among different phenotypes

  • Materials and Methods: In this paper, gene sets differentially expressed between acute lymphoblastic leukaemia (ALL) with BCR-ABL and ALL with no observed cytogenetic abnormalities were determined by gene set analysis (GSA) methods

  • In this study, we evaluated two groups using simulated data and an acute lymphoblastic leukemia (ALL) microarray dataset with the Category and Hotelling's T2 approaches

Summary

Introduction

Microarray technology allows researchers to measure the expression of thousands of genes simultaneously, which has made it a tool for identifying genes that are differentially expressed among different phenotypes. The main aim of researchers is to translate such lists into a better understanding of the underlying biological phenomena related to the phenotypes of interest. This is the starting point for gene set analysis (GSA), which incorporates biological knowledge into statistical analysis (Emmert-Streib and Glazko, 2011). Some researchers used tests based on contingency tables, such as the chi-square and Fisher's exact tests, but these could not find small differences between phenotypes. This subgroup was called Overrepresentation (Man et al., 2000; Al-Shahrour et al., 2004; Khatri and Draghici, 2005).

Conclusions: On simulated data, as the correlation between gene pairs increased, the type I error rate decreased and the power increased for the multivariate (Hotelling's T2) test, in contrast to the univariate (Category) test.
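
The contrast drawn above between the univariate and multivariate tests can be made concrete with the usual textbook forms of the two statistics. The text here does not give formulas, so what follows is only a sketch under stated assumptions: the notation (m, n_1, n_2, t_i, S) is introduced for illustration, the Category-style statistic is taken to be the scaled sum of per-gene t statistics, and the correlation among the t statistics is assumed to mirror the correlation among the genes.

```latex
% Assumed notation (not from the source): a gene set of m genes,
% two groups of sizes n_1 and n_2, per-gene two-sample t statistics t_1, ..., t_m.

% Univariate (Category-style) statistic: scaled sum of per-gene t statistics.
z = \frac{1}{\sqrt{m}} \sum_{i=1}^{m} t_i

% If each t_i has variance close to 1 and the average pairwise
% correlation among the t_i is \bar{\rho}, then
\operatorname{Var}(z) \approx 1 + (m - 1)\,\bar{\rho}

% Multivariate statistic: two-sample Hotelling's T^2 with pooled covariance S.
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
      (\bar{x}_1 - \bar{x}_2)^{\top} S^{-1} (\bar{x}_1 - \bar{x}_2)

% Under multivariate normality and equal covariances
% (requires n_1 + n_2 - m - 1 > 0):
\frac{n_1 + n_2 - m - 1}{(n_1 + n_2 - 2)\, m}\, T^2 \sim F_{m,\; n_1 + n_2 - m - 1}
```

Under these assumptions, positive correlation makes the variance of z larger than the value of 1 implied by a standard normal reference, whereas T2 accounts for the correlation through the inverse covariance matrix. This is one plausible reading of why the two tests behave differently as correlation grows, not a derivation taken from the paper.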

