Abstract

Background: High-density short oligonucleotide microarrays are a primary research tool for assessing global gene expression. Background noise on microarrays comprises a significant portion of the measured raw data, which can have serious implications for the interpretation of the generated data if not estimated correctly.

Results: We introduce an approach to calculate probe affinity based on sequence composition, incorporating nearest-neighbor (NN) information. Our model uses position-specific dinucleotide information, instead of the original single nucleotide approach, and adds up to 10% to the total variance explained (R²) when compared to the previously published model. We demonstrate that correcting for background noise using this approach enhances the performance of the GCRMA preprocessing algorithm when applied to control datasets, especially for detecting low intensity targets.

Conclusion: Modifying the previously published position-dependent affinity model to incorporate dinucleotide information significantly improves the performance of the model. The dinucleotide affinity model enhances the detection of differentially expressed genes when implemented as a background correction procedure in GeneChip preprocessing algorithms. This is conceptually consistent with physical models of binding affinity, which depend on the nearest-neighbor stacking interactions in addition to base-pairing.
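The position-specific dinucleotide encoding described in the abstract can be illustrated with a minimal sketch. It assumes that a 25-mer probe is described by its 24 overlapping dinucleotides, each represented as an indicator over the 16 possible dinucleotides, and that affinity is the sum of one fitted weight per observed (position, dinucleotide) pair. The function names and the `weights` layout are hypothetical and are not taken from the paper or from the GCRMA software.

```python
from itertools import product

BASES = "ACGT"
# All 16 ordered dinucleotides: AA, AC, AG, AT, CA, ..., TT
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]
DINUC_INDEX = {d: i for i, d in enumerate(DINUCLEOTIDES)}


def dinucleotide_features(probe):
    """Return a (len(probe) - 1) x 16 indicator matrix as nested lists.

    Row k contains a single 1, in the column of the dinucleotide that
    starts at position k of the probe (0-based).
    """
    probe = probe.upper()
    if any(base not in BASES for base in probe):
        raise ValueError("probe must contain only A, C, G, T")
    rows = []
    for k in range(len(probe) - 1):
        row = [0] * len(DINUCLEOTIDES)
        row[DINUC_INDEX[probe[k:k + 2]]] = 1
        rows.append(row)
    return rows


def affinity(probe, weights):
    """Sum the weight of each observed (position, dinucleotide) pair.

    weights[k][j] is a hypothetical fitted coefficient for
    dinucleotide j starting at position k; it is an assumption of
    this sketch, not a value from the paper.
    """
    features = dinucleotide_features(probe)
    return sum(
        w
        for feature_row, weight_row in zip(features, weights)
        for f, w in zip(feature_row, weight_row)
        if f
    )
```

Under these assumptions, an unconstrained model would have 24 × 16 = 384 indicator weights per probe length of 25, compared with 25 × 4 = 100 for the single nucleotide encoding.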

Highlights

  • High-density short oligonucleotide microarrays are a primary research tool for assessing global gene expression

  • Each probe pair is composed of a perfect match probe (PM), which exactly complements a region on the transcript, and a mismatch probe (MM), which is identical to the PM probe except at the 13th base, where the complementary nucleotide is introduced [2]

  • The model defined in equation 1 can be expressed as a polynomial of degree 3, reducing the free parameters from 100 to 16

  • Figure 1 shows the 25 position-specific parameters of each of the four nucleotides as a function of their position along the probe for the U133 Latin square dataset
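The parameter reduction mentioned in the highlights can be sketched as follows. This is a minimal illustration, assuming that each nucleotide's position-dependent weight is a cubic polynomial in the 1-based probe position, so that 4 nucleotides × 4 coefficients = 16 parameters replace 4 × 25 = 100 free weights. The function names and the coefficient layout are hypothetical; equation 1 of the paper is not reproduced here.

```python
BASES = "ACGT"
PROBE_LENGTH = 25
DEGREE = 3


def position_weight(coeffs, k):
    """Evaluate c0 + c1*k + c2*k**2 + c3*k**3 at probe position k."""
    return sum(c * k ** power for power, c in enumerate(coeffs))


def probe_affinity(probe, coeffs_by_base):
    """Sum the polynomial weight of the base found at each position.

    coeffs_by_base maps each nucleotide to its DEGREE + 1 polynomial
    coefficients (an assumed layout for this sketch).
    """
    total = 0.0
    for k, base in enumerate(probe.upper(), start=1):
        total += position_weight(coeffs_by_base[base], k)
    return total


def parameter_counts(n_bases=len(BASES), length=PROBE_LENGTH, degree=DEGREE):
    """Return (free per-position parameters, polynomial parameters)."""
    return n_bases * length, n_bases * (degree + 1)
```

Calling `parameter_counts()` with the defaults returns `(100, 16)`, matching the counts quoted above.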


Summary

Introduction

High-density short oligonucleotide microarrays are a primary research tool for assessing global gene expression. The fluorescent signal from each probe includes a specific signal that measures transcript abundance, plus background noise from non-specific binding (NSB) and autofluorescence of the chip surface. MM probes were originally introduced by Affymetrix to measure background noise. However, many groups have shown that MM probes contain a significant amount of the PM signal and are therefore unreliable estimators of background noise [3,4,5]. Using a more accurate estimate of background noise should improve the quality of Affymetrix GeneChip data. In GCRMA, Wu et al. [10] model the signal intensity generated from each probe.

[Figure: panels A), B), C); axis label: log10(per NN affinity)]

