In both industrial applications and basic research, the manipulation of protein stability is essential for understanding the principles that govern protein thermostability. This has made data-mining-based protein engineering and stability prediction an active research area. Many studies address the prediction of protein stability, but they fall short in data preprocessing, leave duplicate records in the dataset, and cannot handle the uncertainty present in the data. The main aim of this paper is to improve the quality of the protein stability dataset and to increase the accuracy of the prediction system. For deduplication, fuzzy K-means (FKM) clustering is applied to group and match duplicate records and remove them. To handle the uncertainty, a Fuzzy Artificial Neural Network (FANN) is used to predict protein stability. Simulation results demonstrate the effectiveness of the combined FKM-FANN approach, which yields better results than existing methods.
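As a rough illustration of the deduplication idea, the sketch below clusters records with the standard fuzzy c-means update (often called fuzzy K-means) and then flags near-identical records that fall in the same dominant cluster. It is a minimal sketch under stated assumptions, not the paper's implementation: the two-dimensional feature vectors, the fuzzifier `m = 2`, the matching rule, and the `max_distance` threshold are all hypothetical, since the abstract does not specify them.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership**m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Membership update from relative distances to every centre.
        shift = 0.0
        for i in range(n):
            # Clamp distances to avoid division by zero at a centre.
            dist = [max(math.dist(points[i], c), 1e-12) for c in centers]
            new_row = []
            for j in range(n_clusters):
                denom = sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                            for k in range(n_clusters))
                new_row.append(1.0 / denom)
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u


def duplicate_pairs(points, memberships, max_distance=1e-3):
    """Assumed matching rule: same dominant cluster and nearly identical features."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in memberships]
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            same_cluster = labels[i] == labels[j]
            if same_cluster and math.dist(points[i], points[j]) <= max_distance:
                pairs.append((i, j))
    return pairs


def deduplicate(points, pairs):
    """Keep the first record of each matched pair and drop the later one."""
    drop = {j for _, j in pairs}
    return [p for i, p in enumerate(points) if i not in drop]


# Hypothetical feature vectors; records 0/1 and 3/4 are exact duplicates.
records = [(1.0, 1.0), (1.0, 1.0), (1.2, 0.9),
           (5.0, 5.1), (5.0, 5.1), (4.8, 5.3)]
centers, memberships = fuzzy_c_means(records, n_clusters=2)
pairs = duplicate_pairs(records, memberships)
cleaned = deduplicate(records, pairs)
```

Restricting the pairwise comparison to records in the same cluster is one plausible way clustering can help here: on a large dataset it avoids comparing every record with every other record. The soft memberships could also be used directly in the matching rule, but that choice is not described in the abstract.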