Pairwise Alignment Algorithm Research Articles

Background

Multiple genome alignment is an important problem in bioinformatics. An important subproblem used by many multiple alignment approaches is that of aligning two multiple alignments. Many popular alignment algorithms for DNA use the sum-of-pairs heuristic, where the score of a multiple alignment is the sum of its induced pairwise alignment scores. However, the biological meaning of the sum-of-pairs heuristic is not obvious. Additionally, many algorithms based on the sum-of-pairs heuristic are complicated and slow compared to pairwise alignment algorithms.

An alternative approach to aligning alignments is to first infer ancestral sequences for each alignment, and then align the two ancestral sequences. In addition to being fast, this method has a clear biological basis that takes into account the evolution implied by an underlying phylogenetic tree.

In this study we explore the accuracy of aligning alignments by ancestral sequence alignment. We examine the use of both maximum likelihood and parsimony to infer ancestral sequences. Additionally, we investigate the effect on accuracy of allowing ambiguity in our ancestral sequences.

Results

We use synthetic sequence data that we generate by simulating evolution on a phylogenetic tree. We use two different types of phylogenetic trees: trees with a period of rapid growth followed by a period of slow growth, and trees with a period of slow growth followed by a period of rapid growth.

We examine the alignment accuracy of four ancestral sequence reconstruction and alignment methods: parsimony, maximum likelihood, ambiguous parsimony, and ambiguous maximum likelihood. Additionally, we compare against the alignment accuracy of two sum-of-pairs algorithms: ClustalW and the heuristic of Ma, Zhang, and Wang.

Conclusion

We find that allowing ambiguity in ancestral sequences does not lead to better multiple alignments. Regardless of whether we use parsimony or maximum likelihood, the success of aligning ancestral sequences containing ambiguity is very sensitive to the choice of gap open cost. Surprisingly, we find that using maximum likelihood to infer ancestral sequences results in less accurate alignments than when using parsimony to infer ancestral sequences. Finally, we find that the sum-of-pairs methods produce better alignments than all of the ancestral alignment methods.
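
To make the sum-of-pairs definition in the abstract above concrete, here is a minimal Python sketch. It is not the scoring scheme of ClustalW or of Ma, Zhang, and Wang: the function names, the match/mismatch values, and the linear gap cost are all assumptions chosen for illustration (the abstract mentions a gap open cost, which this simplified version does not model).

```python
from itertools import combinations


def pair_score(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score the pairwise alignment induced by two rows of a multiple alignment.

    Scoring values are illustrative assumptions, with a simple linear gap cost.
    """
    score = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            # A column gapped in both rows vanishes from the induced alignment.
            continue
        if a == "-" or b == "-":
            score += gap
        elif a == b:
            score += match
        else:
            score += mismatch
    return score


def sum_of_pairs(alignment, **scoring):
    """Sum the induced pairwise scores over every pair of rows."""
    return sum(pair_score(a, b, **scoring) for a, b in combinations(alignment, 2))


if __name__ == "__main__":
    rows = ["AC-T", "ACGT", "A--T"]
    print(sum_of_pairs(rows))
```

Because every pair of rows is scored, the cost grows quadratically with the number of sequences, which is one reason sum-of-pairs methods tend to be slower than a single pairwise alignment.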


The PRECISE database was developed by our laboratory to allow for the systematic study of the ligand interactions common to a set of functionally related enzymes, where an interaction site is defined broadly as any residue(s) that interact with a ligand. During the construction of PRECISE, enzyme chains are extracted from the Protein Data Bank (PDB) and clustered according to functional homology as defined by the Enzyme Commission (EC) nomenclature system. A sequence representative is chosen from each cluster based on the criterion set forth by the non-redundant PDB set, and pair-wise alignments of each cluster member to the representative are performed. Atom-based residue–ligand interactions are calculated for each cluster member, and the summation of ligand interactions for all cluster members at each aligned position is determined. Although we were able to successfully align most clusters using a simple dynamic programming algorithm, several of the clusters created exhibited poor pair-wise alignments of each cluster member to its sequence representative. We hypothesized that the observed alignment problems were, in most cases, due to the incorrect separation and alignment of different domains in multi-domain proteins, a mistake that frequently causes error proliferation in functional annotation. Here we present the results of generating primary sequence patterns for each poorly aligned cluster in PRECISE to assess the extent to which incorrectly aligned multi-domain proteins contribute to poor pair-wise alignments of each cluster member to its representative. This requires the use of an iterative locally optimal pair-wise alignment algorithm to build a hierarchical similarity-based sequence pattern for a set of functionally related enzymes. Our results show that poor alignments in PRECISE are caused most frequently by the misalignment of multi-domain proteins, and that the generation of primary sequence patterns for the assignment of sequence family membership yields better alignments for the functionally related enzyme clusters in PRECISE than our original alignment algorithm.
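
The abstract above does not specify its locally optimal pair-wise alignment algorithm. As a rough illustration of why a local alignment helps with multi-domain proteins, the sketch below uses the standard Smith-Waterman recurrence, a swapped-in textbook technique rather than the PRECISE implementation. The function name and the scoring values (match, mismatch, linear gap) are assumptions.

```python
def local_alignment_score(s, t, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score between s and t.

    Scoring parameters are illustrative assumptions; gaps use a linear cost.
    Only two rows of the dynamic programming matrix are kept in memory.
    """
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere, so unrelated
            # flanking regions do not drag down a well-matched shared region.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Toy strings: a shared "ACGT" segment surrounded by unrelated residues.
    print(local_alignment_score("XXACGTXX", "YYACGTYY"))
```

In this toy case the shared segment scores the same as it would in isolation, whereas a global alignment would also be charged for the mismatched flanks.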



Articles published on Pairwise Alignment Algorithm

Flexible Alignment of Small Molecules Using the Penalty Method

Assessment of microbial communities by graph partitioning in a study of soil fungi in two Alpine meadows.

Elucidating Molecular Overlays from Pairwise Alignments Using a Genetic Algorithm

Substrate Requirements for SPPL2b-dependent Regulated Intramembrane Proteolysis

An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast

Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework

CLePAPS: FAST PAIR ALIGNMENT OF PROTEIN STRUCTURES BASED ON CONFORMATIONAL LETTERS

Fast Pairwise Structural RNA Alignments by Pruning of the Dynamical Programming Matrix

Constrained sequence alignment: A general model and the hardness results

DIAL: a web server for the pairwise alignment of two RNA three-dimensional structures using nucleotide, dihedral angle and base-pairing similarities

Fold Recognition via a Tree

Gene Sequence Clustering for Functional Domain Prediction

MELDB: A database for microbial esterases and lipases

A structurally‐defined gap function for pairwise sequence alignment of proteins in the twilight zone

Ancestral sequence alignment under optimal conditions

Clustering of domains of functionally related enzymes in the interaction database PRECISE by the generation of primary sequence patterns

Multiple-Ligand-Based Virtual Screening: Methods and Applications of the MTree Approach

Pairwise alignment incorporating dipeptide covariation

A gene clustering method with masking cross-matching fragments using modified suffix tree clustering method

A Pairwise Alignment Algorithm Which Favors Clusters of Blocks

