Abstract
Quality control and consistency tests on genotypes and historical pedigree data are applied in routine genomic evaluation and academic research. Quality control takes longer as more genotypes become available, and this step is a bottleneck in the routine evaluation pipeline. For efficient quality control, we have developed several algorithms and a computer program, QCF90, that support large-scale, biallelic single nucleotide polymorphisms (SNPs). The program is designed to detect unsatisfactory genomic markers and individuals in terms of call rate, marker allele frequencies, duplicate samples, and Mendelian inconsistency in large genomic data sets with a pedigree including millions of individuals. Duplicated genotypes can be detected using a set of markers. Each SNP genotype is packed into a 2-bit representation in memory, which enables bitwise operations with parallel computing to perform the quality control efficiently. The software optionally checks the pedigree information for inconsistencies. We compared QCF90 with preGSf90, a preceding program, in terms of memory usage and computing time using a data set including 200,000 genotyped individuals, 50,000 SNP markers per individual, and 216,500 pedigree individuals. In total running time, QCF90 was approximately 6 times faster than preGSf90 (307 s vs. 2075 s), while its memory usage was 30 times lower (2 GB vs. 75 GB), using only 1 thread. QCF90 became faster as more threads were used. A check for genomic duplicates took 159 s with 16 threads when 5,000 genotypes were compared with 200,000 genotypes using 2,500 SNP markers. The new tool is useful in routine genomic evaluation and in academic research in which both genotypes and pedigree information are used. The QCF90 executable is available at http://nce.ads.uga.edu with a user manual.
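The 2-bit packing mentioned above can be illustrated with a minimal sketch. It is not the QCF90 implementation: the coding scheme (0, 1, 2 for allele counts and 3 for a missing call), the function names, and the use of arbitrary-size Python integers in place of fixed-width machine words are all assumptions made for illustration. It shows how a missing-call count, a genotype mismatch count (as in a duplicate check), and an opposing-homozygote count (one form of Mendelian inconsistency) can each be obtained with a few bitwise operations and a population count.

```python
MISSING = 3  # assumed 2-bit code for a missing call (0b11); not taken from QCF90


def pack(genotypes):
    """Pack genotype codes (0, 1, 2 or MISSING) into one int, 2 bits per SNP."""
    word = 0
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2, MISSING):
            raise ValueError(f"unsupported genotype code: {g}")
        word |= g << (2 * i)
    return word


def low_bit_mask(n_snps):
    """Return a mask with the low bit of every 2-bit field set (0b...0101)."""
    mask = 0
    for i in range(n_snps):
        mask |= 1 << (2 * i)
    return mask


def popcount(x):
    """Count set bits in a non-negative int."""
    return bin(x).count("1")


def count_missing(word, n_snps):
    """A field is missing (0b11) when both of its bits are set."""
    mask = low_bit_mask(n_snps)
    return popcount(word & (word >> 1) & mask)


def count_mismatches(a, b, n_snps):
    """A field differs when either bit of the XOR is set."""
    mask = low_bit_mask(n_snps)
    x = a ^ b
    return popcount((x | (x >> 1)) & mask)


def count_opposing_homozygotes(parent, child, n_snps):
    """Count SNPs where one animal is 0 (0b00) and the other is 2 (0b10).

    Homozygotes have a clear low bit (this also excludes missing calls,
    0b11); an opposing pair additionally differs in the high bit.
    """
    mask = low_bit_mask(n_snps)
    both_homozygous = ~parent & ~child & mask
    high_bits_differ = ((parent ^ child) >> 1) & mask
    return popcount(both_homozygous & high_bits_differ)


# Hypothetical genotypes for six SNPs.
parent_codes = [0, 2, 1, 2, MISSING, 0]
child_codes = [2, 2, 0, 0, 0, 1]
n = len(parent_codes)
parent, child = pack(parent_codes), pack(child_codes)

call_rate = 1 - count_missing(parent, n) / n
conflicts = count_opposing_homozygotes(parent, child, n)
mismatches = count_mismatches(parent, child, n)
```

In a compiled implementation the same operations would typically run on fixed-width words with a hardware population count, and independent words or pairs of individuals could be distributed across threads; that part is not shown here.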