Abstract

This research focuses on multiple sequence alignment (MSA), since developing an exact multiple sequence alignment for different protein sequences is a difficult computational task. A hybrid algorithm named Bacterial Foraging Optimization-Genetic Algorithm (BFO-GA) is proposed to improve the multiple objectives and performance measures of multiple sequence alignment. The proposed algorithm employs multiple objectives: minimization of a variable gap penalty, and maximization of similarity and of non-gap percentage. The proposed BFO-GA algorithm is compared with various MSA methods, namely T-Coffee, Clustal Omega, Muscle, K-Align, MAFFT, GA, ACO, ABC and PSO. The experiments were carried out on four benchmark datasets, namely BAliBASE 3.0, Prefab 4.0, SABmark 1.65 and Oxbench 1.3, and the outcomes indicate that the proposed BFO-GA algorithm obtains statistically significantly better results than the other well-known methods. The study also evaluates the practicability of the BFO-GA alignments by using the resulting optimal alignment to predict a phylogenetic tree with the ClustalW2 Phylogeny tool and comparing it with the existing algorithms using the Robinson-Foulds (RF) distance metric. Lastly, the statistical significance of the proposed algorithm is assessed with the Wilcoxon Matched-Pair Signed-Rank test, which also indicates better results.

Highlights

  • Dynamic programming is the basic approach to solving multiple sequence alignment problems

  • The performance of the proposed algorithm has been assessed by comparing it with several optimization techniques, namely Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC) and Particle Swarm Optimization (PSO), and with existing online tools, namely T-Coffee, Muscle, K-Align, MAFFT and Clustal Omega

  • This research focuses on performance measures, namely the ratio of correctly aligned pairs (Sum of Pairs, SP) and the ratio of correctly aligned columns (Total Column Score, TCS), and on multiple objectives, namely maximization of similarity and non-gap percentage and minimization of the gap penalty
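The two performance measures named in the highlights can be illustrated with a minimal sketch. It assumes the common benchmark-style definitions: SP is the fraction of residue pairs aligned in a reference alignment that are also aligned in the test alignment, and the column score is the fraction of reference columns reproduced exactly in the test alignment. The function names, the `-` gap symbol, and the choice to count every reference column are assumptions made for illustration; the paper's exact scoring procedure may differ.

```python
from itertools import combinations


def column_indices(alignment, gap="-"):
    """Turn each column into a tuple of residue indices (None marks a gap)."""
    counters = [0] * len(alignment)
    columns = []
    for col in zip(*alignment):
        entry = []
        for s, ch in enumerate(col):
            if ch == gap:
                entry.append(None)
            else:
                entry.append(counters[s])
                counters[s] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(columns):
    """Collect every residue pair that shares a column."""
    pairs = set()
    for col in columns:
        for s, t in combinations(range(len(col)), 2):
            if col[s] is not None and col[t] is not None:
                pairs.add((s, col[s], t, col[t]))
    return pairs


def sp_score(test, reference):
    """Fraction of reference residue pairs that the test alignment recovers."""
    ref_pairs = aligned_pairs(column_indices(reference))
    test_pairs = aligned_pairs(column_indices(test))
    return len(ref_pairs & test_pairs) / len(ref_pairs) if ref_pairs else 0.0


def column_score(test, reference):
    """Fraction of reference columns reproduced exactly in the test alignment."""
    ref_cols = column_indices(reference)
    test_cols = set(column_indices(test))
    return sum(c in test_cols for c in ref_cols) / len(ref_cols) if ref_cols else 0.0


reference = ["AC-GT", "ACAGT"]  # hypothetical toy alignments
test = ["ACG-T", "ACAGT"]
print(sp_score(test, reference), column_score(test, reference))
```

In this toy case the test alignment misplaces one residue, so it recovers three of the four reference pairs and three of the five reference columns.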


Summary

Introduction

Dynamic programming is the basic approach to solving multiple sequence alignment problems. The Needleman-Wunsch algorithm is one of the foremost applications of dynamic programming, and it is applied to compare biological sequences[3]. Although dynamic programming is applicable to any number of sequences, it is computationally expensive in both memory and time. Later, a heuristic search known as the progressive technique, also identified as the hierarchical or tree method, was deployed for multiple sequence alignment[4]. Among the advantages of the progressive alignment technique is that the resulting alignments may be reasonable. R-Coffee is a web server that creates highly accurate multiple alignments of non-coding RNA sequences, and it is founded on the principle of T-Coffee[6]. The major disadvantage of the progressive technique is the choice of the “most related” sequences.
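To make the dynamic programming idea concrete, here is a minimal sketch of Needleman-Wunsch global alignment for two sequences. The scoring values (match = 1, mismatch = -1, linear gap = -1) and the function name are assumptions chosen for illustration, not the parameters used in the paper. The table has (n+1) x (m+1) cells for two sequences; extending the same recurrence to k sequences needs a k-dimensional table, which is why the exact approach becomes expensive in memory and time.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

When several alignments share the optimal score, the traceback order (diagonal, then up, then left) decides which one is returned; other tie-breaking rules are equally valid.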

