Novel approach for parallelizing pairwise comparison problems as applied to detecting segments identical by decent in whole-genome data.

Emmanuel Sapin, Matthew C Keller

doi:10.1093/bioinformatics/btab084

Open Access

Abstract

Motivation: Pairwise comparison problems arise in many areas of science. In genomics, datasets are already large and getting larger, so operations that require pairwise comparisons (either on pairs of SNPs or pairs of individuals) are extremely computationally challenging. We propose a generic algorithm for addressing pairwise comparison problems that breaks a large problem (of order n² comparisons) into multiple smaller ones (each of order n comparisons), allowing for massive parallelization.

Results: We demonstrated that this approach is very efficient for calling identical by descent (IBD) segments between all pairs of individuals in the UK Biobank dataset, with a 250-fold savings in time and 750-fold savings in memory over the standard approach to detecting such segments across the full dataset. This efficiency should extend to other methods of IBD calling and, more generally, to other pairwise comparison tasks in genomics or other areas of science.

Availability and Implementation: A GitHub page is available at https://github.com/emmanuelsapin with the code to generate data needed for the implementation.
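The idea of splitting an order-n² comparison task into independent jobs of order n comparisons can be illustrated with a minimal sketch. This is not the authors' partitioning scheme, which the abstract does not describe; it is a plain block decomposition chosen for illustration. The function names (`make_blocks`, `make_jobs`, `pairs_for_job`, `plan`) and the choice of a block size near the square root of n are assumptions made here.

```python
from itertools import combinations
from math import isqrt


def make_blocks(items, block_size):
    """Split the items into consecutive blocks of at most block_size."""
    return [items[i:i + block_size] for i in range(0, len(items), block_size)]


def make_jobs(n_blocks):
    """One job per unordered pair of blocks, including each block with itself."""
    return [(i, j) for i in range(n_blocks) for j in range(i, n_blocks)]


def pairs_for_job(blocks, job):
    """List the item pairs that a single job is responsible for comparing."""
    i, j = job
    if i == j:
        # Within-block job: every unordered pair inside one block.
        return list(combinations(blocks[i], 2))
    # Cross-block job: every item of block i against every item of block j.
    return [(a, b) for a in blocks[i] for b in blocks[j]]


def plan(n):
    """Hypothetical planner: blocks of about sqrt(n) items, so each job
    covers at most about n pairs (an assumption made for this sketch)."""
    items = list(range(n))
    block_size = max(1, isqrt(n))
    blocks = make_blocks(items, block_size)
    return blocks, make_jobs(len(blocks))


if __name__ == "__main__":
    blocks, jobs = plan(100)
    largest = max(len(pairs_for_job(blocks, job)) for job in jobs)
    print(len(jobs), "jobs; largest job has", largest, "pairs")
```

Each job depends only on its own two blocks, so jobs can be dispatched to separate workers or cluster nodes, and every unordered pair of items is assigned to exactly one job. How an IBD caller would consume each job's pairs is outside the scope of this sketch.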

Journal: Bioinformatics (Oxford, England)
Publication Date: Mar 11, 2021
Citations: 3
License type: CC BY 4.0
