Abstract

DNA-based microarrays are increasingly central to biomedical research. Selecting oligonucleotide sequences that will behave consistently across experiments is essential to the design, production and performance of DNA microarrays. Here our aim was to improve on probe design parameters by empirically and systematically evaluating probe performance in a multivariate context. We used experimental data from 19 array CGH hybridizations to assess the probe performance of 385,474 probes tiled in the Duchenne muscular dystrophy (DMD) region of the X chromosome. Our results demonstrate that probe melting temperature, single nucleotide polymorphisms (SNPs), and homocytosine motifs all have a strong effect on probe behavior. These findings, when incorporated into future microarray probe selection algorithms, may improve microarray performance for a wide variety of applications.

Highlights

  • We explored whether the presence and length of homoadenine, homocytosine, homoguanine, or homothymidine sequence motifs could influence probe performance

  • Our results indicate that Tm, the presence of single nucleotide polymorphisms (SNPs), and the presence of homocytosine motifs all influence probe behavior

  • We propose that the variance in the log2 ratio across multiple experiments captures poor probe behavior, such as unreliable binding by target sequences as well as low target capture, since at low signals the log2 variance is often inflated
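
The variance measure described in the last highlight can be illustrated with a short sketch. It assumes each probe has one test and one reference intensity per hybridization. The function name, the input layout, and the use of the sample variance are illustrative assumptions, not the authors' pipeline.

```python
import math
from statistics import variance


def log2_ratio_variance(test_signals, ref_signals):
    """Sample variance of log2(test/ref) for one probe across hybridizations.

    Hypothetical helper: the input layout (one test and one reference
    intensity per experiment) is an assumption made for illustration.
    """
    if len(test_signals) != len(ref_signals):
        raise ValueError("test and reference lists must have the same length")
    if len(test_signals) < 2:
        raise ValueError("need at least two hybridizations to compute a variance")
    if any(v <= 0 for v in list(test_signals) + list(ref_signals)):
        raise ValueError("intensities must be positive to take a log ratio")
    # One log2 ratio per hybridization for this probe.
    ratios = [math.log2(t / r) for t, r in zip(test_signals, ref_signals)]
    return variance(ratios)


# A probe whose ratio is identical in every experiment has zero variance;
# a probe whose ratio flips between experiments has a large one.
stable = log2_ratio_variance([100, 200, 400], [100, 200, 400])
noisy = log2_ratio_variance([100, 400, 100], [200, 200, 200])
```

Ranking probes by this value would put inconsistent probes first. Because the variance is often inflated at low signal, as the highlight notes, a high value may reflect weak target capture as much as unreliable binding.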

Introduction

DNA-based microarrays have become central to current biomedical research for a host of diverse applications[1], ranging from assessment of genomic copy number (array CGH)[2,3,4] and identification of transcription factor binding sites (ChIP-chip)[5,6,7] to resequencing[8,9,10,11,12] and SNP genotyping[13,14,15,16,17]. Graf and coworkers expanded on these results by developing a probe uniqueness score (U) based on the number of unique substrings of sequence within a given target region[22]. This group developed a probe-selection algorithm incorporating U, melting temperature (Tm), and synthesis cycle number with sequence-specific filters. With this algorithm they have demonstrated acceptable coverage of the mouse genome[22].
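
One simple kind of sequence-specific filter, and the motif class examined in this study, is a homopolymer run. The sketch below finds the longest run of each base in a probe and rejects probes whose runs exceed a cutoff. It is not the filter used by Graf and coworkers or by the authors: the function names and the `max_run` default of 4 are hypothetical, chosen only for illustration.

```python
from itertools import groupby


def longest_homopolymer_runs(seq):
    """Return the longest run length of each base (A, C, G, T) in seq."""
    runs = {base: 0 for base in "ACGT"}
    # groupby yields consecutive stretches of identical characters.
    for base, group in groupby(seq.upper()):
        if base in runs:
            runs[base] = max(runs[base], sum(1 for _ in group))
    return runs


def passes_homopolymer_filter(seq, max_run=4):
    """True if no homopolymer run is longer than max_run (hypothetical cutoff)."""
    return all(n <= max_run for n in longest_homopolymer_runs(seq).values())


runs = longest_homopolymer_runs("ACCCCCGTTA")
keep_probe = passes_homopolymer_filter("ACCCCCGTTA")
```

A per-base cutoff would be a natural variation, since the results summarized above single out homocytosine motifs rather than homopolymers in general.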

