Abstract

In this article, we walk through an end-to-end Affymetrix microarray differential expression workflow using Bioconductor packages. This workflow is directly applicable to current "Gene" type arrays, e.g. the HuGene or MoGene arrays, but can easily be adapted to similar platforms. The data analyzed here is a typical clinical microarray data set that compares inflamed and non-inflamed colon tissue in two disease subtypes. For each disease, the differential gene expression between inflamed and non-inflamed colon tissue was analyzed. We start from the raw data CEL files, show how to import them into a Bioconductor ExpressionSet, perform quality control and normalization, and finally carry out differential gene expression (DE) analysis, followed by some enrichment analysis.

Highlights

  • A section on the use of False Discovery Rate (FDR) control in multiple testing problems has been added: We illustrate the benefits of FDR control using p-values from the data at hand

  • In this article we introduce a complete workflow for a typical (Affymetrix) microarray analysis

  • The data set used [1] is from a paper studying the differences in gene expression between inflamed and non-inflamed tissue. Patients suffering from ulcerative colitis (UC) and patients with Crohn’s disease (CD) were tested, and from each patient inflamed and non-inflamed colonic mucosa tissue was obtained via a biopsy
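The FDR control mentioned in the first highlight can be illustrated with a small sketch. The article works with Bioconductor packages in R; the Python below is only an illustration of the Benjamini-Hochberg adjustment, which is one common FDR procedure and is assumed here rather than taken from the article. The function name `benjamini_hochberg` and the p-values are hypothetical and synthetic, not results from the data set.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Synthetic p-values for ten hypothetical genes (illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
adj = benjamini_hochberg(pvals)

raw_hits = sum(p < 0.05 for p in pvals)   # genes called without any correction
fdr_hits = sum(q <= 0.05 for q in adj)    # genes called at an FDR of 5%
print(raw_hits, fdr_hits)
```

In this toy input, fewer genes pass the 5% threshold after adjustment than before it, which is the trade-off FDR control makes: some raw "hits" are given up in exchange for a bound on the expected proportion of false discoveries among the genes that are called.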


Summary

15 Jun 2016

The individual workflow steps contain more detailed discussions of the code. We try to explain the statistics behind the individual steps carefully and in a non-technical way. Relative Log Expression (RLE) analysis has been implemented as another data quality control step. The method for intensity-based filtering of genes prior to the differential expression analysis has been changed. We explain the principles of differential expression analysis using a specific gene: we fit the linear model to this gene, explain its rationale, and compare it to the standard t test. A section on the use of False Discovery Rate (FDR) control in multiple testing problems has been added: we illustrate the benefits of FDR control using p-values from the data at hand.
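To make the RLE quality control step more concrete, here is a minimal sketch under the usual definition of Relative Log Expression: for each gene, the median log2 intensity across samples is subtracted from every sample's value, and the resulting per-sample distributions are compared. The workflow itself is written in R with Bioconductor; the Python below, the helper names `rle` and `per_sample_median`, and the toy matrix are illustrative assumptions, not the article's code or data.

```python
from statistics import median


def rle(log_expr):
    """Subtract each gene's median (across samples) from its log2 intensities.

    log_expr is a list of rows, one row per gene, one column per sample.
    """
    return [[x - median(row) for x in row] for row in log_expr]


def per_sample_median(rle_matrix):
    """Median RLE value of each sample (column)."""
    n_samples = len(rle_matrix[0])
    return [median(row[j] for row in rle_matrix) for j in range(n_samples)]


# Toy log2 intensities: 4 genes x 3 samples; sample 3 is shifted upwards.
toy = [
    [5.0, 5.2, 6.1],
    [7.0, 6.9, 8.0],
    [3.0, 3.1, 4.2],
    [9.0, 9.1, 10.0],
]

sample_medians = per_sample_median(rle(toy))
print(sample_medians)
```

For arrays of good quality, the per-sample RLE values should be centered near zero with similar spread. In the toy matrix, the third sample's median sits well above zero, which is the kind of deviation that would flag an array for closer inspection.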

