Abstract

Genotyping arrays characterize genome-wide SNPs for a study cohort and were the primary technology behind genome-wide association studies over the last decade. The Cancer Genome Atlas (TCGA) is one of the largest cancer consortium studies, and it collected genotyping data for all of its participants. Using TCGA SNP data from 12,064 samples genotyped on the Affymetrix 6.0 SNP array, we conducted comprehensive comparisons across DNA sources (tumor tissue, normal tissue, and blood) and sample storage protocols (formalin-fixed paraffin-embedded (FFPE) vs. freshly frozen (FF)), examining genotypes, transition/transversion ratios, and mutation catalogues. The analysis yielded several observations relevant to data quality. SNP concordance was excellent between blood and normal tissues, and slightly lower between blood and tumor tissue, possibly because of somatic mutations in the tumors. The poor SNP concordance observed between FFPE and FF samples suggested a batch effect. The transition/transversion ratio, a metric commonly used for quality control in exome sequencing projects, appeared less applicable to genotyping array data because of the whole-genome coverage built into the array design. Moreover, loss of heterozygosity events substantially outnumbered gain of heterozygosity events when tumors were compared with normal tissues and blood. This might be a consequence of extensive copy number deletions in tumors. In summary, our thorough evaluation calls for more adequate quality control practices and provides guidelines for improved application of TCGA genotyping data.
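
The three quantities compared in the abstract (genotype concordance, the transition/transversion ratio, and loss versus gain of heterozygosity) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the two-character genotype strings such as "AA", "AB", and "BB", and the use of None for no-calls are all assumptions made for illustration.

```python
from collections import Counter

# The two transition classes (purine<->purine, pyrimidine<->pyrimidine).
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def concordance(calls_a, calls_b):
    """Fraction of SNPs with identical calls in two samples.

    Positions where either sample has a no-call (None) are skipped.
    """
    shared = [(a, b) for a, b in zip(calls_a, calls_b)
              if a is not None and b is not None]
    if not shared:
        return float("nan")
    return sum(a == b for a, b in shared) / len(shared)


def ti_tv_ratio(allele_pairs):
    """Transitions divided by transversions over (allele1, allele2) pairs.

    Assumes every pair holds two different bases from A, C, G, T.
    """
    ti = sum(frozenset(pair) in TRANSITIONS for pair in allele_pairs)
    tv = len(allele_pairs) - ti
    return ti / tv if tv else float("inf")


def heterozygosity_changes(reference_calls, tumor_calls):
    """Count loss (het -> hom) and gain (hom -> het) of heterozygosity.

    reference_calls would come from blood or normal tissue (assumption).
    """
    counts = Counter()
    for ref, tum in zip(reference_calls, tumor_calls):
        if ref is None or tum is None:
            continue
        ref_het = ref[0] != ref[1]
        tum_het = tum[0] != tum[1]
        if ref_het and not tum_het:
            counts["LOH"] += 1
        elif tum_het and not ref_het:
            counts["GOH"] += 1
    return counts


# Toy data, invented purely to exercise the functions.
blood = ["AA", "AB", "BB", None, "AB"]
tumor = ["AA", "AA", "BB", "AB", "AB"]
snp_alleles = [("A", "G"), ("C", "T"), ("A", "C")]

blood_tumor_concordance = concordance(blood, tumor)
array_ti_tv = ti_tv_ratio(snp_alleles)
het_changes = heterozygosity_changes(blood, tumor)
```

In this sketch the heterozygosity counts are derived from genotype calls alone; a call that changes because of a copy number deletion and one that changes because of a genotyping error look identical here, which is one reason such counts need careful quality control.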
