Abstract
DNA-based microarrays are increasingly central to biomedical research. Selecting oligonucleotide sequences that will behave consistently across experiments is essential to the design, production and performance of DNA microarrays. Here our aim was to improve on probe design parameters by empirically and systematically evaluating probe performance in a multivariate context. We used experimental data from 19 array CGH hybridizations to assess the probe performance of 385,474 probes tiled in the Duchenne muscular dystrophy (DMD) region of the X chromosome. Our results demonstrate that probe melting temperature, single nucleotide polymorphisms (SNPs), and homocytosine motifs all have a strong effect on probe behavior. These findings, when incorporated into future microarray probe selection algorithms, may improve microarray performance for a wide variety of applications.
Highlights
We explored whether the presence and length of homoadenine, homocytosine, homoguanine, or homothymidine sequence motifs could influence probe performance
Our results indicate that Tm, the presence of single nucleotide polymorphisms (SNPs), and the presence of homocytosine motifs all influence probe behavior
We propose that the variance in the log(2) ratio across multiple experiments captures poor probe behavior, such as unreliable binding by target sequences as well as low target capture, since at low signals the log(2) variance is often inflated
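The two probe-level quantities named in these highlights can be illustrated with a minimal sketch. It assumes each probe has a sequence and one log(2) ratio per hybridization. The function names, probe names, sequences, and ratio values are hypothetical and are not taken from the study. The use of the sample variance is likewise an assumption, since the text does not specify the estimator.

```python
import re
from statistics import variance


def longest_homopolymer(seq, base):
    """Return the length of the longest run of `base` in `seq` (0 if absent)."""
    runs = re.findall(f"{base}+", seq.upper())
    return max((len(run) for run in runs), default=0)


def log2_ratio_variance(ratios):
    """Sample variance of one probe's log2 ratios across hybridizations."""
    return variance(ratios)


# Hypothetical probes: (sequence, log2 ratios from four hybridizations).
probes = {
    "probe_a": ("ATGCCCCCGTAAGT", [0.02, -0.01, 0.03, 0.00]),
    "probe_b": ("ATGCATGCATGCAT", [0.40, -0.35, 0.10, -0.20]),
}

for name, (seq, ratios) in probes.items():
    c_run = longest_homopolymer(seq, "C")
    var = log2_ratio_variance(ratios)
    print(f"{name}: longest C run = {c_run}, log2 ratio variance = {var:.4f}")
```

Under this sketch, a probe whose variance is high relative to the other probes would be flagged as behaving inconsistently, and the homopolymer length could be used as one covariate when examining why.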
Summary
DNA-based microarrays have become central to current biomedical research for a host of diverse applications[1], ranging from assessment of genomic copy number (array CGH)[2,3,4] and identification of transcription factor binding sites (ChIP-chip)[5,6,7] to resequencing[8,9,10,11,12] and SNP genotyping[13,14,15,16,17]. Graf and coworkers expanded on these results by developing a probe uniqueness score (U) based on the number of unique substrings of sequence within a given target region[22]. This group developed a probe selection algorithm incorporating U, melting temperature (Tm), and synthesis cycle number, together with sequence-specific filters. With this algorithm they demonstrated acceptable coverage of the mouse genome[22].