Abstract
Background: PLINK is probably the most used program for analyzing SNP genotypes and runs of homozygosity (ROH), both in human and in animal populations. Over the last decade, ROH analyses have become the state-of-the-art method for inbreeding assessment. In PLINK, the --homozyg function is used to perform ROH analyses and relies on several input settings. These settings can have a large impact on the outcome, and the default values are not always appropriate for medium density SNP array data. Guidelines for a robust and uniform ROH analysis in PLINK using medium density data are lacking, although such guidelines are vital for comparing different ROH studies. In this study, 8 populations of different livestock and pet species are used to demonstrate the importance of PLINK input settings. Moreover, the effects of pruning SNPs for low minor allele frequencies and linkage disequilibrium on ROH detection are shown.
Results: We introduce the genome coverage parameter to appropriately estimate F_ROH and to check the validity of ROH analyses. The effect of pruning for linkage disequilibrium and low minor allele frequencies on ROH analyses is highly population dependent, and such pruning may result in missed ROH. PLINK’s minimal density requirement is crucial for medium density genotypes; if it is set too low, the genome coverage of the ROH analysis is limited. Finally, we provide recommendations for the maximal gap, scanning window length and threshold settings.
Conclusions: In this study, we present guidelines for an adequate and robust ROH analysis in PLINK on medium density SNP data. Furthermore, we advise reporting parameter settings in publications and validating them prior to analysis. Moreover, we encourage authors to report genome coverage to reflect the validity of the ROH analysis. Implementing these guidelines will substantially improve the overall quality and uniformity of ROH analyses.
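To make the role of genome coverage concrete, the following is a minimal sketch in Python. It rests on an assumption that the abstract does not spell out: that the ROH-based inbreeding coefficient is the summed ROH length divided by the length of genome the analysis is able to cover. The function name `f_roh` and all numbers are hypothetical and serve only as illustration.

```python
def f_roh(roh_lengths_kb, genome_coverage_kb):
    """Fraction of the covered genome that lies in ROH.

    Assumption (not stated in the abstract): the denominator is the
    genome length that the ROH analysis can actually cover.
    """
    if genome_coverage_kb <= 0:
        raise ValueError("genome coverage must be positive")
    return sum(roh_lengths_kb) / genome_coverage_kb


# Hypothetical values, for illustration only.
roh_lengths_kb = [1500.0, 4200.0, 12000.0]   # ROH detected in one individual
autosome_length_kb = 2_500_000.0             # full autosomal length
genome_coverage_kb = 2_300_000.0             # part reachable by the analysis

f_full = f_roh(roh_lengths_kb, autosome_length_kb)
f_cov = f_roh(roh_lengths_kb, genome_coverage_kb)
```

Under this assumption, dividing by the full autosomal length gives a lower value than dividing by the covered length whenever parts of the genome cannot be covered, which is one way to read why reporting genome coverage is encouraged.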
Highlights
PLINK is probably the most used program for analyzing SNP genotypes and runs of homozygosity (ROH), both in human and in animal populations
Results and figures for PIT, the Belgian Blue cattle population (BB), the Australian polled Merino sheep population (MER) and the Burmese cat population (BUR) are provided in the main manuscript, whereas results for SAA, the Swedish bred Icelandic horse population (ICE), the Labrador dog population (LAB) and the Barnevelder chicken population (BAR) can be found in Additional files 1 to 7
Pruning for linkage disequilibrium: the results of pruning for varying LD levels prior to ROH analysis for PIT, BB, MER and BUR are shown in Fig. 1; results for SAA, ICE, LAB and BAR are given in Additional file 2: Figure S1
Summary
Runs of homozygosity (ROH) are the state-of-the-art method for inbreeding analyses in livestock populations [1]. Short ROH are indicators of distant inbreeding. PLINK [7, 8] is the most used program for ROH analyses in livestock populations [1]. Its --homozyg function performs the ROH analysis and relies on several input settings, which can have a large impact on the outcome; the default values are not always appropriate for medium density SNP array data. A scanning window of defined length moves stepwise along an individual’s genome, and each SNP is scored by the proportion of the windows containing it that are homozygous.
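The scanning-window scoring described above can be sketched in Python as follows. This is a simplified illustration, not PLINK's implementation: it ignores missing genotypes and physical distances, and the function name `window_scores`, the genotype coding (0 and 2 homozygous, 1 heterozygous), the window size and the threshold value are all assumptions made for the example.

```python
def window_scores(genotypes, window_snp, max_het=0):
    """Score each SNP by the proportion of windows containing it that are homozygous.

    genotypes  : list of allele counts (0 or 2 = homozygous, 1 = heterozygous)
    window_snp : number of SNPs per scanning window
    max_het    : heterozygous calls tolerated inside a "homozygous" window
    """
    n = len(genotypes)
    if n < window_snp:
        return [0.0] * n  # no complete window fits
    hom_hits = [0] * n    # homozygous windows covering each SNP
    totals = [0] * n      # all windows covering each SNP
    for start in range(n - window_snp + 1):
        window = genotypes[start:start + window_snp]
        is_hom = sum(1 for g in window if g == 1) <= max_het
        for i in range(start, start + window_snp):
            totals[i] += 1
            if is_hom:
                hom_hits[i] += 1
    return [h / t for h, t in zip(hom_hits, totals)]


# Toy individual with one heterozygous call at index 4.
genotypes = [0, 0, 2, 2, 1, 0, 0]
scores = window_scores(genotypes, window_snp=3)

# Hypothetical threshold: SNPs scoring at or above it are ROH candidates.
threshold = 0.5
candidate = [s >= threshold for s in scores]
```

In this toy example the SNPs to the left of the heterozygous call keep high scores, while the heterozygous SNP and its right-hand neighbours score zero because every window covering them contains the heterozygous call. The choice of window length and threshold therefore directly decides which SNPs can end up inside a ROH.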