Empirical Bayes single nucleotide variant-calling for next-generation sequencing data

Ali Karimnezhad, Theodore J Perkins

doi:10.1038/s41598-024-51958-z

Abstract

One of the fundamental computational problems in cancer genomics is the identification of single nucleotide variants (SNVs) from DNA sequencing data. Many statistical models and software implementations for SNV calling have been developed, yet they still disagree widely on real datasets. Using an empirical Bayesian approach, we introduce a local false discovery rate (LFDR) estimator for germline SNV calling. Our approach learns model parameters without prior information and simultaneously accounts for information across all sites in the genomic regions of interest. We also propose a second LFDR-based algorithm that reliably prioritizes a given list of mutations called by any other variant-calling algorithm. We use a suite of gold-standard cell line data to compare our LFDR approach against a collection of widely used, state-of-the-art programs. We find that our LFDR approach approximately matches or exceeds the performance of all of these programs, despite some very large differences among them. Furthermore, when prioritizing other algorithms’ calls by our LFDR score, we find that by adjusting the tradeoff between type I and type II errors we can select subsets of variant calls with minimal loss of sensitivity but dramatic increases in precision.
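
To make the LFDR idea concrete, the following is a minimal sketch of a generic two-group empirical Bayes calculation. It is not the model from the paper, whose details are not given in the abstract. The binomial read-count model, the fixed `error_rate` and `alt_fraction` values, and the function names `estimate_lfdr` and `prioritize` are all assumptions made for illustration. The sketch estimates the proportion of non-variant sites from all sites at once, computes an LFDR per site, and then ranks a list of calls by that score.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def estimate_lfdr(sites, error_rate=0.01, alt_fraction=0.5, n_iter=100):
    """Two-group LFDR sketch (illustrative assumptions, not the paper's model).

    sites: list of (alt_reads, depth) pairs, one per genomic site.
    Null group: non-reference reads arise only from sequencing error.
    Alternative group: heterozygous variant with the assumed alt_fraction.
    Returns (pi0, lfdr), where pi0 is the estimated null proportion.
    """
    f0 = [binom_pmf(k, n, error_rate) for k, n in sites]
    f1 = [binom_pmf(k, n, alt_fraction) for k, n in sites]
    pi0 = 0.5  # starting guess; refined by EM using all sites together
    for _ in range(n_iter):
        lfdr = [pi0 * a / (pi0 * a + (1 - pi0) * b) for a, b in zip(f0, f1)]
        pi0 = sum(lfdr) / len(lfdr)
    lfdr = [pi0 * a / (pi0 * a + (1 - pi0) * b) for a, b in zip(f0, f1)]
    return pi0, lfdr


def prioritize(call_ids, lfdr, threshold=0.05):
    """Rank calls by ascending LFDR and keep those at or below the threshold."""
    ranked = sorted(zip(call_ids, lfdr), key=lambda pair: pair[1])
    return [(cid, score) for cid, score in ranked if score <= threshold]


# Toy data: (non-reference reads, total depth) at six hypothetical sites.
sites = [(0, 30), (1, 40), (15, 30), (0, 25), (20, 40), (1, 35)]
pi0, lfdr = estimate_lfdr(sites)
kept = prioritize(["s1", "s2", "s3", "s4", "s5", "s6"], lfdr)
```

Lowering `threshold` keeps fewer calls with stronger evidence, and raising it keeps more. This is the kind of type I versus type II error tradeoff the abstract describes when prioritizing another caller's output.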

Journal: Scientific Reports
Publication Date: Jan 18, 2024
License type: CC BY 4.0

