NCMHap: a novel method for haplotype reconstruction based on Neutrosophic c-means clustering

Fatemeh Zamani,Mohammad Hossein Olyaee,Alireza Khanteymoori

doi:10.1186/s12859-020-03775-0

Open Access

https://doi.org/10.1186/s12859-020-03775-0

Journal: BMC Bioinformatics
Publication Date: Oct 22, 2020
Citations: 2
License type: open-access

Affiliation: University of Zanjan, University of Gonabad

Abstract

Background: The single individual haplotype problem refers to reconstructing the haplotypes of an individual from several input fragments sequenced from a specified chromosome. Solving this problem is an important task in computational biology and has many applications in the pharmaceutical industry, clinical decision-making, and genetic diseases. The problem is known to be NP-hard. Although several methods have been proposed to solve it, most of them perform poorly on noisy input fragments. Proposing a method that is both accurate and scalable is therefore a challenging task.

Results: In this paper, we introduce a method, named NCMHap, which uses the Neutrosophic c-means (NCM) clustering algorithm. The NCM algorithm can effectively detect noise and outliers in the input data and can reduce their effect on the clustering process. The proposed method has been evaluated on several benchmark datasets. Comparison with existing methods indicates that, when NCM is tuned with suitable parameters, the results are encouraging. In particular, as the amount of noise increases, NCMHap outperforms the compared methods.

Conclusion: The proposed method is validated using simulated and real datasets. The results support applying NCMHap to datasets whose fragments contain a large number of gaps and a high level of noise.

Highlights

The single individual haplotype problem refers to reconstructing the haplotypes of an individual from several input fragments sequenced from a specified chromosome.
By evaluating the results of NCMHap on both simulated and real datasets, we found that the proposed approach could effectively overcome the challenge of noise in the input fragments and could provide promising results compared with current methods.
In this paper, we presented a method based on the Neutrosophic c-means (NCM) clustering algorithm for the haplotype assembly problem.

Summary

Introduction

The single individual haplotype problem refers to reconstructing the haplotypes of an individual from several input fragments sequenced from a specified chromosome. Solving this problem is an important task in computational biology and has many applications in the pharmaceutical industry, clinical decision-making, and genetic diseases. Although several methods have been proposed to solve the problem, most of them perform poorly on noisy input fragments. It has been revealed that the human genome shows some degree of inter-individual and inter-population variation, which makes it an appropriate target for rigorous functional genomic analysis [1, 2]. Since each haplotype is derived from one copy of a specific chromosome, there are two copies of haplotypes
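To make the problem statement concrete, the following is a minimal sketch of the input and output of haplotype reconstruction. It is not the NCMHap method: it swaps in a plain hard two-cluster iteration (assign each fragment to the nearest haplotype, then rebuild each haplotype by majority vote) in place of Neutrosophic c-means, and it has no noise or outlier handling beyond the vote. The encoding of fragments as strings over '0', '1', and '-' (gap), and the names mismatches, consensus, and reconstruct, are assumptions made for illustration.

```python
from collections import Counter

GAP = "-"  # assumed marker for a SNP position the fragment does not cover


def mismatches(fragment, haplotype):
    """Count positions where both strings are covered and the alleles differ."""
    return sum(
        1
        for f, h in zip(fragment, haplotype)
        if f != GAP and h != GAP and f != h
    )


def consensus(fragments, length):
    """Majority vote per position; positions covered by no fragment stay gaps."""
    result = []
    for pos in range(length):
        votes = Counter(f[pos] for f in fragments if f[pos] != GAP)
        result.append(votes.most_common(1)[0][0] if votes else GAP)
    return "".join(result)


def reconstruct(fragments, max_iters=20):
    """Split fragments into two groups and return one consensus per group."""
    length = len(fragments[0])
    # Seed with the first fragment and the fragment that disagrees with it most.
    seed_a = fragments[0]
    seed_b = max(fragments, key=lambda f: mismatches(f, seed_a))
    haps = [seed_a, seed_b]
    for _ in range(max_iters):
        groups = ([], [])
        for f in fragments:
            side = 0 if mismatches(f, haps[0]) <= mismatches(f, haps[1]) else 1
            groups[side].append(f)
        # Keep the previous haplotype if a group ends up empty.
        new_haps = [consensus(g, length) if g else h for g, h in zip(groups, haps)]
        if new_haps == haps:
            break
        haps = new_haps
    return haps


# Toy input: six clean fragments plus one with a sequencing error at position 2.
fragments = [
    "0101--",
    "1010--",
    "--0110",
    "--1001",
    "01-11-",
    "10-00-",
    "0111--",  # noisy copy of the first haplotype
]
print(reconstruct(fragments))  # prints ['010110', '101001']
```

In this toy case the single erroneous allele is outvoted by the two clean fragments covering the same position. With heavier noise or sparser coverage, a hard assignment like this one degrades quickly, which is the setting the abstract says NCMHap targets.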

Methods

Results

Discussion

Conclusion