Improving Retrieval Efficacy of Homology Searches Using the False Discovery Rate.

Hyrum D Carroll, Anthony G Davis, Alex C Williams, John L Spouge

doi:10.1109/tcbb.2014.2366112

Abstract

Over the past few decades, discovery based on sequence homology has become a widely accepted practice. Consequently, the comparative accuracy of retrieval algorithms (e.g., BLAST) has been rigorously studied for improvement. Unlike most components of retrieval algorithms, the E-value threshold criterion has yet to be thoroughly investigated. An investigation of the threshold is important because it alone dictates which sequences are declared relevant and which irrelevant. In this paper, we introduce the false discovery rate (FDR) statistic as a replacement for the uniform threshold criterion to improve efficacy in retrieval systems. Using NCBI's BLAST and PSI-BLAST software packages, we demonstrate the applicability of such a replacement in both non-iterative (BLAST(FDR)) and iterative (PSI-BLAST(FDR)) homology searches. For each application, we evaluated retrieval efficacy with five different multiple testing methods on a large training database. For each algorithm, we chose the best performing method, Benjamini-Hochberg, as the default statistic. As measured by the threshold average precision, BLAST(FDR) yielded 14.1 percent better retrieval performance than BLAST on a large (5,161 queries) test database, and PSI-BLAST(FDR) attained 11.8 percent better retrieval performance than PSI-BLAST. The C++ source code specific to BLAST(FDR) and PSI-BLAST(FDR) and instructions are available at http://www.cs.mtsu.edu/~hcarroll/blast_fdr/.
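To make the idea of swapping a fixed E-value cutoff for an FDR criterion concrete, the following is a minimal Python sketch of the standard Benjamini-Hochberg step-up procedure applied to a list of hits. It is not the authors' C++ implementation. The function names, the example E-values, the `alpha` level, and the choice to use the number of returned hits as the number of tests are all illustrative assumptions. The conversion P = 1 - exp(-E) is the usual relationship between BLAST E-values and P-values.

```python
import math


def evalue_to_pvalue(evalue):
    """Convert an E-value to a P-value via P = 1 - exp(-E)."""
    # expm1 keeps precision for very small E-values.
    return -math.expm1(-evalue)


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of hits declared relevant by Benjamini-Hochberg.

    Assumption: the number of tests m is the number of hits supplied.
    """
    m = len(pvalues)
    # Rank hits from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= alpha * k / m.
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Every hit ranked at or below the cutoff is declared relevant.
    return sorted(order[:cutoff_rank])


# Hypothetical E-values for six hits from a single query.
evalues = [1e-30, 1e-5, 0.02, 0.5, 3.0, 10.0]
pvalues = [evalue_to_pvalue(e) for e in evalues]
relevant = benjamini_hochberg(pvalues, alpha=0.05)
print(relevant)  # [0, 1, 2]
```

In this sketch the cutoff adapts to the whole list of hits for a query rather than being one uniform threshold applied to every hit, which is the contrast the abstract draws.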

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication Date: May 1, 2015
Citations: 3

