Identical By Descent Segments Research Articles

The determination of the relationship between a pair of individuals is a fundamental application of genetics. Previously, we and others have demonstrated that identity-by-descent (IBD) information generated from high-density single-nucleotide polymorphism (SNP) data can greatly improve the power and accuracy of genetic relationship detection. Whole-genome sequencing (WGS) marks the final step in increasing genetic marker density by assaying all single-nucleotide variants (SNVs), and thus has the potential to further improve relationship detection by enabling more accurate detection of IBD segments and more precise resolution of IBD segment boundaries. However, WGS introduces new complexities that must be addressed in order to achieve these improvements in relationship detection. To evaluate these complexities, we estimated genetic relationships from WGS data for 1490 known pairwise relationships among 258 individuals in 30 families along with 46 population samples as controls. We identified several genomic regions with excess pairwise IBD in both the pedigree and control datasets using three established IBD methods: GERMLINE, fastIBD, and ISCA. These spurious IBD segments produced a 10-fold increase in the rate of detected false-positive relationships among controls compared to high-density microarray datasets. To address this issue, we developed a new method to identify and mask genomic regions with excess IBD. This method, implemented in ERSA 2.0, fully resolved the inflated cryptic relationship detection rates while improving relationship estimation accuracy. ERSA 2.0 detected all 1st through 6th degree relationships, and 55% of 9th through 11th degree relationships in the 30 families. We estimate that WGS data provides a 5% to 15% increase in relationship detection power relative to high-density microarray data for distant relationships. 
Our results identify regions of the genome that are highly problematic for IBD mapping and introduce new software to accurately detect 1st through 9th degree relationships from whole-genome sequence data.
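
The masking idea described in this abstract can be illustrated with a minimal sketch. It is not the ERSA 2.0 implementation: the window size, the median-based cutoff, the `fold` parameter, and all function names are assumptions made for illustration. Segments are treated as half-open `(chrom, start, end)` intervals in base pairs, taken from pairs of control samples.

```python
from collections import Counter
from statistics import median

WINDOW = 1_000_000  # hypothetical window size in base pairs


def _windows(start, end, window):
    """Indices of the fixed-size windows overlapped by [start, end)."""
    return range(start // window, (end - 1) // window + 1)


def window_counts(segments, window=WINDOW):
    """Count how many pairwise IBD segments overlap each (chrom, window)."""
    counts = Counter()
    for chrom, start, end in segments:
        for w in _windows(start, end, window):
            counts[(chrom, w)] += 1
    return counts


def excess_windows(counts, fold=5.0):
    """Flag windows whose count exceeds `fold` times the median count.

    Assumption: the median is taken over windows with at least one
    segment, since windows with no segments never enter `counts`.
    """
    if not counts:
        return set()
    cutoff = fold * median(counts.values())
    return {key for key, n in counts.items() if n > cutoff}


def mask_segments(segments, flagged, window=WINDOW):
    """Keep only segments that overlap no flagged window."""
    return [
        (chrom, start, end)
        for chrom, start, end in segments
        if not any((chrom, w) in flagged for w in _windows(start, end, window))
    ]


# Toy control data: one window on chromosome "1" carries far more
# pairwise IBD than the others.
controls = [
    ("1", 0, 500_000),
    ("1", 1_200_000, 1_800_000),
    ("1", 3_100_000, 3_900_000),
] + [("1", 2_100_000, 2_900_000)] * 10

flagged = excess_windows(window_counts(controls))
pedigree = [("1", 2_500_000, 2_600_000), ("1", 500_000, 900_000)]
kept = mask_segments(pedigree, flagged)
```

In this toy input only the window starting at 2 Mb is flagged, so the pedigree segment inside it is dropped and the other is kept. Dropping whole segments is the simplest choice; clipping a segment at the mask boundary would be an alternative.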

Identity by descent (IBD) can be reliably detected for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals that share only short IBD segments. New sequencing technologies facilitate identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to use rare variants for the detection of short IBD segments. Short IBD segments reveal genetic structures at high resolution. Therefore, they can help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing and to increase the power of association studies. Since short IBD segments are further assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies biclustering to identify very short IBD segments characterized by rare variants. HapFABIA is designed to detect short IBD segments in genotype data that were obtained from next-generation sequencing, but can also be applied to DNA microarray data. Especially in next-generation sequencing data, HapFABIA exploits rare variants for IBD detection. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. HapFABIA identified 160 588 different short IBD segments characterized by rare variants with a median length of 23 kb (mean 24 kb) in data for chromosome 1 of the 1000 Genomes Project. These short IBD segments contain 752 000 single nucleotide variants (SNVs), which account for 39% of the rare variants and 23.5% of all variants. The vast majority—152 000 IBD segments—are shared by Africans, while only 19 000 and 11 000 are shared by Europeans and Asians, respectively. 
IBD segments that match the Denisova or the Neandertal genome are found significantly more often in Asians and Europeans but also, in some cases exclusively, in Africans. The lengths of IBD segments and their sharing between continental populations indicate that many short IBD segments from chromosome 1 existed before humans migrated out of Africa. Thus, rare variants that tag these short IBD segments predate human migration from Africa. The software package HapFABIA is available from Bioconductor. All data sets, result files and programs for data simulation, preprocessing and evaluation are supplied at http://www.bioinf.jku.at/research/short-IBD.
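
To make the role of rare variants concrete, here is a minimal sketch of pairwise rare-allele sharing. It is not HapFABIA and does not perform biclustering; it only illustrates why runs of shared rare minor alleles suggest a short IBD segment. The frequency cutoff, the gap and variant-count thresholds, and the function names are all assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import combinations


def shared_rare_positions(variants, n_haplotypes, max_freq=0.05):
    """Map each haplotype pair to positions where both carry a rare allele.

    `variants` is an iterable of (position, carriers), where `carriers`
    is the set of haplotype ids carrying the minor allele. Private
    variants (one carrier) cannot be shared and are skipped.
    """
    shared = defaultdict(list)
    for pos, carriers in variants:
        if 2 <= len(carriers) <= max_freq * n_haplotypes:
            for pair in combinations(sorted(carriers), 2):
                shared[pair].append(pos)
    return shared


def candidate_segments(positions, max_gap=25_000, min_variants=3):
    """Group shared positions into runs and keep runs with enough variants.

    A run ends when the next shared position is more than `max_gap`
    base pairs away. Returns (first_pos, last_pos, n_variants) tuples.
    """
    segments = []
    run = []
    for pos in sorted(positions):
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_variants:
                segments.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    if len(run) >= min_variants:
        segments.append((run[0], run[-1], len(run)))
    return segments


# Toy data: 100 haplotypes; haplotypes 1 and 2 share three nearby rare
# alleles, one distant rare allele, and one common allele.
variants = [
    (1_000, {1, 2}),
    (5_000, {1, 2, 3}),
    (9_000, {1, 2}),
    (12_000, set(range(50))),  # common variant, ignored
    (200_000, {1, 2}),
]
shared = shared_rare_positions(variants, n_haplotypes=100)
segments_1_2 = candidate_segments(shared[(1, 2)])
```

A pairwise scan like this scales poorly with sample size and cannot pool evidence across more than two haplotypes, which is the kind of limitation a biclustering approach is meant to address.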

Articles published on Identical By Descent Segments

A fast and accurate method for detection of IBD shared haplotypes in genome-wide SNP data

Exploring Identity-By-Descent Segments and Putative Functions Using Different Foundation Parents in Maize

IBD Sharing between Africans, Neandertals, and Denisovans

Robust Inference of Identity by Descent from Exome-Sequencing Data

A Genealogical Look at Shared Ancestry on the X Chromosome

Rapidly Registering Identity-by-Descent Across Ancestral Recombination Graphs

Atlas of Cryptic Genetic Relatedness Among 1000 Human Genomes

Leveraging Distant Relatedness to Quantify Human Mutation and Gene-Conversion Rates

PIGS: improved estimates of identity-by-descent probabilities by probabilistic IBD graph sampling

Genotyping of geographically diverse Druze trios reveals substructure and a recent bottleneck

Parente2: a fast and accurate method for detecting identity by descent

A renewal theory approach to IBD sharing

Identity-by-descent approaches identify regions of importance for genetic susceptibility to hereditary esophageal squamous cell carcinoma

Genome-wide mapping of IBD segments in an Ashkenazi PD cohort identifies associated haplotypes

Reducing pervasive false-positive identical-by-descent segments detected by large-scale pedigree analysis

An Effective Filter for IBD Detection in Large Data Sets

Relationship estimation from whole-genome sequence data

Efficient clustering of identity-by-descent between multiple individuals

HapFABIA: Identification of very short segments of identity by descent characterized by rare variants in large sequencing data

Detecting Identity by Descent and Estimating Genotype Error Rates in Sequence Data
