A Horizontal Alignment Tool for Numerical Trend Discovery in Sequence Data: Application to Protein Hydropathy

Omar Hadzipasic, Vincent J Hilser, James O Wrabl, Jacquelyn S Fetrow

doi:10.1371/journal.pcbi.1003247

Open Access

https://doi.org/10.1371/journal.pcbi.1003247

Journal: PLoS Computational Biology	Publication Date: Oct 10, 2013
Citations: 41	License type: CC BY 4.0

Affiliation: Johns Hopkins University

Abstract

An algorithm is presented that returns the optimal pairwise gapped alignment of two sets of signed numerical sequence values. One distinguishing feature of this algorithm is a flexible comparison engine (based on both relative shape and absolute similarity measures) that does not rely on explicit gap penalties. Additionally, an empirical probability model is developed to estimate the significance of the returned alignment with respect to randomized data. The algorithm's utility for biological hypothesis formulation is demonstrated with test cases including database search and pairwise alignment of protein hydropathy. However, the algorithm and probability model could possibly be extended to accommodate other diverse types of protein or nucleic acid data, including positional thermodynamic stability and mRNA translation efficiency. The algorithm requires only numerical values as input and will readily compare data other than protein hydropathy. The tool is therefore expected to complement, rather than replace, existing sequence and structure based tools and may inform medical discovery, as exemplified by proposed similarity between a chlamydial ORFan protein and bacterial colicin pore-forming domain. The source code, documentation, and a basic web-server application are available.
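
The idea of judging significance against randomized data can be illustrated with a small permutation test. This is only a minimal sketch under stated assumptions, not the paper's comparison engine or probability model, whose details are not given here: it compares two equal-length, ungapped numerical profiles, uses the Pearson correlation as a stand-in "relative shape" measure, and estimates an empirical p-value by shuffling one profile. The function names `pearson` and `empirical_p_value` and the parameters `n_shuffles` and `seed` are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def empirical_p_value(x, y, n_shuffles=1000, seed=0):
    """Estimate how often a shuffled copy of y matches x at least as well as y does.

    Assumption: ungapped, position-by-position comparison. The published tool
    performs gapped alignment, which this sketch does not attempt.
    """
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if pearson(x, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Hypothetical signed numerical profiles (for example, per-residue hydropathy values).
profile_x = [1.8, -4.5, -3.5, 2.5, -0.4, 4.5, 3.8, -3.9, 1.9, 2.8, -1.6, -0.8]
profile_y = [2 * v + 1 for v in profile_x]  # same shape, different absolute scale

score, p_value = empirical_p_value(profile_x, profile_y)
print(f"shape score = {score:.3f}, empirical p = {p_value:.4f}")
```

Because the second profile is a rescaled copy of the first, a shape-only measure such as correlation scores it as a perfect match even though the absolute values differ. That gap is one reason a comparison engine might combine relative shape with an absolute similarity measure, as the abstract describes.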

Highlights

Determining the evolutionary relatedness of two protein sequences is most successfully performed by amino acid sequence comparison [1,2,3,4,5].
We have developed a novel tool that discovers significantly similar trends shared between two numerical data sets.
Since we are a protein biophysics group, we are most naturally interested in discovering new similarities between proteins, and we have discovered an interesting, statistically significant similarity between a protein unique to Chlamydia and a bacterial pore-forming protein, colicin.

Summary

Introduction

Determining the evolutionary relatedness of two protein sequences is most successfully performed by amino acid sequence comparison [1,2,3,4,5]. Similar properties could exist horizontally in a sequence even when recognizable vertical conservation is lost [7]. Even if such similarities are due to analogy rather than homology [8], approaches are needed that can augment sequence-based analysis by matching patterns that may be independent of amino acid conservation at each position. Proteins may also be meaningfully characterized by other attributes, such as the energetic contributions to stability [19] or the predicted codon translation efficiency along the mRNA [20,21,22]. Such attributes are not accommodated by simple adaptation of current algorithms, largely because the scoring systems for such algorithms are based on positional sequence identity (amino acid substitution matrices) or absolute geometric structural similarity (Euclidean distance).
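
To make the contrast between vertical (per-position) conservation and horizontal trends concrete, the following sketch converts two short peptides into hydropathy profiles. It assumes the Kyte-Doolittle scale, which is one commonly used hydropathy scale; the text above does not state which scale the tool uses. The two peptides are invented for illustration, and the helper names `hydropathy_profile` and `identical_positions` are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence):
    """Map a one-letter amino acid sequence to a list of signed hydropathy values."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def identical_positions(seq_a, seq_b):
    """Count positions where two ungapped sequences carry the same residue."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)


# Two invented peptides with no residue in common at any position.
seq_a = "ILVKDE"
seq_b = "VAIRNQ"

profile_a = hydropathy_profile(seq_a)
profile_b = hydropathy_profile(seq_b)

print("identical positions:", identical_positions(seq_a, seq_b))
print("profile A:", profile_a)
print("profile B:", profile_b)
```

The two peptides share no identical residues, so an identity-based score sees nothing in common, yet both profiles run from three hydrophobic (positive) positions to three hydrophilic (negative) ones. Numerical profiles of this kind are the sort of input a trend-based comparison operates on.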
