Phylogenetic Analysis Using MapReduce Programming Model

G.M Siddesh, Eklavya Uppal, Ishank Mishra, Abhinav Anurag, K.G Srinivasa

doi:10.1109/ipdpsw.2015.57

Abstract

Phylogenetic analysis has become an essential part of research on the evolutionary tree of life. Distance-matrix methods of phylogenetic analysis rely explicitly on a measure of "genetic distance" between the sequences being classified, and therefore require multiple sequence alignments as input. Distance methods construct an all-to-all matrix from the sequence query set that describes the distance between each pair of sequences. Dynamic programming algorithms such as the Needleman-Wunsch algorithm (NWA) and the Smith-Waterman algorithm (SWA) produce accurate alignments, but they are computationally intensive and limited in the number and size of sequences they can handle. This paper focuses on optimizing phylogenetic analysis of large quantities of data using the Hadoop MapReduce programming model. The proposed approach relies on NWA to produce sequence alignments and on a distance-based clustering method, specifically UPGMA (Unweighted Pair Group Method with Arithmetic mean), to produce rooted trees. The experimental results demonstrate that the proposed solution achieves significant improvements in performance and throughput. The dynamic programming nature of NWA, coupled with the data and computational parallelism of the Hadoop MapReduce programming model, improves the throughput and accuracy of sequence alignment. The proposed approach thus aims to establish a new methodology for optimizing phylogenetic analysis with a significant performance gain.
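The pipeline described in the abstract (pairwise NWA alignment, an all-to-all distance matrix, then UPGMA) can be illustrated with a minimal single-machine sketch. This is not the authors' Hadoop implementation: the scoring parameters, the score-to-distance conversion, and the function names `nw_score`, `map_pair`, `pairwise_distances`, and `upgma` are all assumptions made for illustration, and Python's built-in `map` merely stands in for the map phase that Hadoop would distribute across nodes.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score (two-row dynamic programming).

    The match/mismatch/gap values are illustrative assumptions.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def map_pair(pair, seqs):
    """Map step: one pair of sequence names -> ((name_i, name_j), distance).

    Assumed distance: 1 - score / longest length, so identical sequences
    are at distance 0 under the default scoring.
    """
    i, j = pair
    longest = max(len(seqs[i]), len(seqs[j]))
    return (i, j), 1.0 - nw_score(seqs[i], seqs[j]) / longest


def pairwise_distances(seqs):
    """Collect all pairwise distances; each pair is independent work,
    which is what makes the problem suit a map phase."""
    pairs = combinations(sorted(seqs), 2)
    return dict(map(lambda p: map_pair(p, seqs), pairs))


def upgma(names, dist):
    """Build a rooted tree (nested tuples) by repeatedly merging the two
    closest clusters and averaging distances weighted by cluster size."""
    size = {n: 1 for n in names}
    d = {frozenset(k): v for k, v in dist.items()}
    while len(size) > 1:
        x, y = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (x, y)
        for z in size:
            if z != x and z != y:
                d[frozenset((merged, z))] = (
                    size[x] * d[frozenset((x, z))]
                    + size[y] * d[frozenset((y, z))]
                ) / (size[x] + size[y])
        size[merged] = size.pop(x) + size.pop(y)
    return next(iter(size))


if __name__ == "__main__":
    # Toy input, invented for this example.
    seqs = {"A": "GATTACA", "B": "GATTACA", "C": "GCTTACA", "D": "CCCCCCC"}
    print(upgma(sorted(seqs), pairwise_distances(seqs)))
```

Because every call to `map_pair` depends only on its own two sequences, the all-to-all matrix can be split across workers with no coordination, and only the comparatively cheap tree-building step needs the full set of distances in one place.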
