PAIR: paired allelic log-intensity-ratio-based normalization method for SNP-CGH arrays

Shengping Yang, Kun Zhang, Stanley Pounds, Zhide Fang

doi:10.1093/bioinformatics/bts683

Abstract

Normalization is critical in DNA copy number analysis. We propose a new method that identifies two-copy probes across the genome to obtain representative references for normalization in single nucleotide polymorphism (SNP) arrays. The method is based on a two-state Hidden Markov Model. Unlike most currently available methods in the literature, the proposed method does not assume that two-copy probes make up the dominant fraction of the genome; it requires only that some two-copy probes exist. Real data analysis and a simulation study show that the proposed algorithm (i) performs as well as current methods (e.g. CGHnormaliter and popLowess) for samples with dominant two-copy states and outperforms these methods for samples with less dominant two-copy states; (ii) can identify copy-neutral loss of heterozygosity; and (iii) is computationally efficient. R scripts are available at http://publichealth.lsuhsc.edu/PAIR.html.
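To make the idea of a two-state Hidden Markov Model concrete, the following is a minimal generic sketch in Python, not the authors' R implementation. It assumes Gaussian emissions with known means and standard deviations, a symmetric transition matrix, and a median-based reference; none of these choices is stated in the abstract. The function names, parameter values, and toy log-ratios are hypothetical and serve only as illustration.

```python
import math
import statistics


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution (assumed emission model)."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(obs, means, sds, stay_prob=0.95, init=(0.5, 0.5)):
    """Most likely state path for a two-state HMM with Gaussian emissions.

    State 0 is taken to be "two-copy" and state 1 "not two-copy"; this
    labelling is an assumption made for the illustration.
    """
    if not obs:
        raise ValueError("obs must contain at least one value")
    states = range(2)
    # Symmetric transitions: stay with stay_prob, switch otherwise.
    log_trans = [
        [math.log(stay_prob if i == j else 1.0 - stay_prob) for j in states]
        for i in states
    ]
    score = [
        math.log(init[s]) + gaussian_logpdf(obs[0], means[s], sds[s])
        for s in states
    ]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for x in obs[1:]:
        new_score, pointers = [], []
        for j in states:
            best_i = max(states, key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(
                score[best_i] + log_trans[best_i][j]
                + gaussian_logpdf(x, means[j], sds[j])
            )
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def normalize_by_two_copy(obs, path):
    """Subtract the median of the probes labelled two-copy (state 0)."""
    two_copy = [x for x, s in zip(obs, path) if s == 0]
    if not two_copy:
        raise ValueError("no two-copy probes were identified")
    reference = statistics.median(two_copy)
    return reference, [x - reference for x in obs]


# Toy log-ratios invented for illustration (not data from the paper).
log_ratios = [0.31, 0.29, 0.33, 0.30, 0.82, 0.85, 0.79, 0.81, 0.30, 0.32]
path = viterbi_two_state(log_ratios, means=(0.3, 0.8), sds=(0.05, 0.05))
reference, normalized = normalize_by_two_copy(log_ratios, path)
```

In this sketch the reference is computed only from probes decoded as two-copy, so it does not depend on those probes forming the majority of the input, only on at least one of them being present.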

Full Text