Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis

Thenmozhi K, Shanthi S, Karthikeyani Visalakshi N, Pyingkodi M

doi:10.31557/apjcp.2018.19.11.3105

Open Access

https://doi.org/10.31557/apjcp.2018.19.11.3105

Abstract

Objective: With the rapid growth of biological sequence databases, handling such volumes of data has become increasingly difficult. Clustering has become one of the principal research objectives in structural and functional genomics. However, exact clustering algorithms, such as partitional and hierarchical clustering, scale relatively poorly in run time and memory usage on large sets of sequences.

Methods: To overcome these performance limits, a heuristic optimization, the Cuckoo Search Algorithm with genetic operators (ICSA), was implemented in a distributed computing environment. The proposed ICSA is a global optimization algorithm that can cluster large numbers of protein sequences by running on distributed computing hardware.

Results: It allocates both memory and computing resources efficiently. Compared with the latest research results, our method requires only 15% of the execution time and obtains even higher-quality information about protein sequences.

Conclusion: From the experimental analysis, we observed that clustering large protein sequence data sets with the ICSA technique, rather than with alignment methods alone, greatly reduces execution time and improves the efficiency of this important task in molecular biology. Moreover, the new era of proteomics is providing extensive knowledge of mutations and other alterations relevant to cancer research.

Highlights

A widely adopted paradigm in cancer diagnosis and treatment is that early cancer detection increases the likelihood of survival (Zhu et al., 2011).
From the experimental analysis, we observed that clustering large protein sequence data sets with the Improved Cuckoo Search Algorithm (ICSA), rather than with alignment methods alone, greatly reduces execution time and improves the efficiency of this important task in molecular biology.
An experiment was conducted on a large-scale protein database to demonstrate the effectiveness of the proposed ICSA algorithm.

Summary

Introduction

A widely adopted paradigm in cancer diagnosis and treatment is that early cancer detection increases the likelihood of survival (Zhu et al., 2011). Protein sequence analysis offers a major opportunity for diagnosing, stratifying, and monitoring disease. To be clinically useful, this analysis must meet certain requirements. Schloss et al. (2009) developed a software package that allows users to analyze community sequence data with a single piece of software; it lets the user screen, trim, and assign sequences to operational taxonomic units. Existing implementations such as HPC-CLUST (Matias and Mering, 2014), CD-HIT (Li and Godzik, 2006), MOTHUR (Schloss et al., 2009), ESPRIT (Sun et al., 2009), and RDP online clustering (Cole et al., 2009) all struggle with large sets of sequences.
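To make the idea named in the abstract more concrete, the following is a minimal, single-machine sketch of a generic cuckoo search combined with crossover and mutation operators, applied to centroid-based clustering. It is not the authors' ICSA implementation and does not show the distributed part. It assumes the sequences have already been encoded as fixed-length numeric vectors (for example, amino-acid composition), and all function names, parameters (`n_nests`, `pa`, `pm`), and the sum-of-squared-errors objective are illustrative assumptions.

```python
import math
import random


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1) or 1e-12  # guard against an exact zero
    return u / abs(v) ** (1 / beta)


def random_nest(points, k, rng):
    """A nest is one candidate solution: k centroids copied from the data."""
    return [list(p) for p in rng.sample(points, k)]


def cuckoo_cluster(points, k, n_nests=15, n_iter=100, pa=0.25, pm=0.1, seed=0):
    """Hypothetical helper: returns (best centroids, best SSE)."""
    rng = random.Random(seed)
    nests = [random_nest(points, k, rng) for _ in range(n_nests)]
    fitness = [sse(points, nest) for nest in nests]

    for _ in range(n_iter):
        best = nests[min(range(n_nests), key=fitness.__getitem__)]
        for i in range(n_nests):
            # Levy flight: perturb each coordinate relative to the best nest.
            trial = [
                [x + 0.01 * levy_step(rng) * (x - bx) for x, bx in zip(c, bc)]
                for c, bc in zip(nests[i], best)
            ]
            # Crossover (genetic operator): inherit some centroids from the best nest.
            trial = [
                list(bc) if rng.random() < 0.5 else c
                for c, bc in zip(trial, best)
            ]
            # Mutation (genetic operator): occasionally jitter one coordinate.
            if rng.random() < pm:
                c_idx = rng.randrange(k)
                d_idx = rng.randrange(len(trial[c_idx]))
                trial[c_idx][d_idx] += rng.gauss(0, 1)
            # Greedy selection: keep the trial only if it lowers the SSE.
            f = sse(points, trial)
            if f < fitness[i]:
                nests[i], fitness[i] = trial, f
        # Abandon the worst fraction pa of nests and rebuild them at random.
        worst = sorted(range(n_nests), key=fitness.__getitem__, reverse=True)
        for i in worst[: int(pa * n_nests)]:
            nests[i] = random_nest(points, k, rng)
            fitness[i] = sse(points, nests[i])

    b = min(range(n_nests), key=fitness.__getitem__)
    return nests[b], fitness[b]
```

Calling `cuckoo_cluster(vectors, k=2)` on a list of equal-length numeric tuples returns the best set of centroids found and its score. In a distributed setting, the fitness evaluations for the nests are the natural unit to spread across workers, but how the paper partitions the work is not stated in this summary.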

Journal: Asian Pacific journal of cancer prevention : APJCP
Publication Date: Nov 1, 2018
Citations: 3
License type: cc-by
