Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty

Ankit Agrawal, Xiaoqiu Huang

doi:10.1186/1471-2105-10-s3-s1

Abstract

Background

Accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, a comparative study of pairwise statistical significance with database statistical significance was conducted. In this paper, we extend the earlier work on pairwise statistical significance by incorporating the use of multiple parameter sets.

Results

Results for a knowledge discovery application of homology detection reveal that using multiple parameter sets for pairwise statistical significance estimates gives better coverage than using a single parameter set, at least at some error levels. Further, the results of pairwise statistical significance using multiple parameter sets are shown to be significantly better than database statistical significance estimates reported by BLAST and PSI-BLAST, and comparable to, and at times significantly better than, those reported by SSEARCH. Using non-zero parameter set change penalty values gives better performance than using a zero penalty.

Conclusion

The fact that homology detection performance does not degrade when using multiple parameter sets is strong evidence for the validity of the assumption that the alignment score distribution follows an extreme value distribution even when using multiple parameter sets. The parameter set change penalty is a useful parameter for alignment using multiple parameter sets. Pairwise statistical significance using multiple parameter sets can be effectively used to determine the relatedness of one pair, or a few pairs, of sequences without performing a time-consuming database search.
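The extreme value distribution assumption mentioned in the abstract can be illustrated with a small sketch. It uses the standard Karlin-Altschul form of the tail probability, P(S >= x) = 1 - exp(-K m n e^(-λx)), where m and n are the sequence lengths. The function name `evd_pvalue` and the numeric values of λ and K below are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the paper's own procedure for estimating pair-specific parameters is not reproduced here.

```python
import math


def evd_pvalue(score, lam, k, m, n):
    """Tail probability P(S >= score) under an extreme value (Gumbel-type) model.

    lam and k are the statistical parameters of the score distribution;
    m and n are the lengths of the two sequences being aligned.
    """
    # Expected number of chance alignments scoring at least `score`.
    expected = k * m * n * math.exp(-lam * score)
    # 1 - exp(-expected), computed with expm1 for accuracy when expected is tiny.
    return -math.expm1(-expected)


# Illustrative call with placeholder parameters (assumed values, not from the paper).
p_low_score = evd_pvalue(score=50, lam=0.267, k=0.041, m=200, n=250)
p_high_score = evd_pvalue(score=80, lam=0.267, k=0.041, m=200, n=250)
print(p_low_score, p_high_score)
```

Under this model a higher alignment score always yields a smaller P-value for fixed parameters, which is why the estimates of λ and K for a given sequence pair matter so much to the final significance.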

Highlights

Accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison
It is clear that the proposed pairwise statistical significance using multiple parameter sets performs significantly better than BLAST and PSI-BLAST at all error levels, comparable to SSEARCH at low error levels, and significantly better than SSEARCH at higher error levels
The results show that PSI-BLAST gave poorer performance than pairwise statistical significance using multiple parameter sets, even with position-specific scoring matrices (PSSMs) constructed against the benchmark CATH database used in our experiments

Summary

Introduction

Accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Local sequence alignment plays a major role in the analysis of DNA and protein sequences [1,2,3]. It is the basic step of many other applications, such as detecting homology, finding protein structure and function, and deciphering evolutionary relationships. Because the alignment score distribution depends on factors such as the alignment program, scoring scheme, sequence lengths, and sequence compositions [10], it is possible to have two alignments of different sequence pairs with scores x and y with x

Journal: BMC Bioinformatics	Publication Date: Mar 1, 2009
Citations: 55	License type: CC BY 2.0

Similar Papers

Pairwise statistical significance of local sequence alignment using multiple parameter sets
Ankit Agrawal ... Xiaoqiu Huang
30 Oct 2008

Conservative, Non-conservative and Average Pairwise Statistical Significance of Local Sequence Alignment
Ankit Agrawal ... Xiaoqiu Huang
01 Jan 2008

Tuning of Multiple Parameter Sets in Evolutionary Algorithms
Martin Andersson ... Sunith Bandaru
20 Jul 2016

MPIPairwiseStatSig
Ankit Agrawal ... Sanchit Misra
21 Jun 2010
