Abstract

Grid computing has attracted attention as a way to solve the complex problems of large-scale bioinformatics applications, and it helps improve data accuracy and processing speed across multiple computation platforms. Outlier detection helps keep the classification success rate high and reduces processing time. This paper focuses on a data clustering and classification method with outlier detection, an important bioinformatics application, in a grid environment. It proposes grid-based and outlier detection-based clustering and classification (GODDCC), which uses grid computational resources with geographically distributed bioinformatics data sets. GODDCC can run large-scale bioinformatics applications while guaranteeing high bio-data accuracy with reasonable grid resources. This paper evaluates the performance of GODDCC against data clustering and classification (DCC) without outlier detection. The GODDCC model records the lowest average processing time and the highest resource utilization among the DCC models compared. The outlier detection method reduces processing time for DCC models while maintaining a high classification success rate, and grid computing shows great promise for high-performance processing of geographically distributed, large-scale bio-data sets in bioinformatics applications.
