Abstract

Identifying Parkinson’s disease (PD) genes plays an important role in improving the diagnosis and treatment of the disease. A number of machine learning methods have been proposed to identify disease-related genes, but only a few of them have been adopted for PD. This work puts forth a novel neural network-based ensemble (n-semble) method to identify Parkinson’s disease genes, in which an artificial neural network is trained to combine (ensemble) the predictions of multiple models. The proposed n-semble method is composed of four parts: (1) protein sequences are used to construct feature vectors based on the physicochemical properties of amino acids; (2) dimensionality reduction is achieved using the t-Distributed Stochastic Neighbor Embedding (t-SNE) method; (3) the Jaccard method is applied to find likely negative samples among the unknown (candidate) genes; and (4) gene prediction is performed with the n-semble method. The proposed method has been compared with Smalter’s, ProDiGe, PUDI and EPU methods using various evaluation metrics. It outperforms these existing gene identification methods, achieving significantly higher precision, recall and F Score (88.9%, 90.9% and 89.8%, respectively). The obtained results confirm the effectiveness and validity of the proposed framework.
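
Step (3) can be illustrated with a small sketch. The abstract does not say how candidates are scored, so the following is only one plausible reading, under stated assumptions: each gene is represented by a set of discretized feature identifiers, and the candidates whose highest Jaccard similarity to any known disease gene is lowest are taken as likely negatives. The function names, the gene labels and the feature sets are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def likely_negatives(positives, unlabeled, k):
    """Return the k unlabeled genes least similar to every positive gene.

    Assumption (not from the source): a candidate is scored by its highest
    Jaccard similarity to any known disease gene, and the lowest-scoring
    candidates are treated as likely negatives.
    """
    scores = {
        name: max(jaccard(features, pos) for pos in positives.values())
        for name, features in unlabeled.items()
    }
    # Sort by score, breaking ties by name so the result is deterministic.
    return sorted(scores, key=lambda name: (scores[name], name))[:k]


# Toy data: hypothetical gene labels and feature-identifier sets.
positives = {"pos_1": {1, 2, 3}, "pos_2": {2, 3, 4}}
unlabeled = {"cand_a": {1, 2, 3, 4}, "cand_b": {7, 8}, "cand_c": {3, 9}}

print(likely_negatives(positives, unlabeled, k=2))
```

In this toy example cand_a overlaps heavily with both positive genes, so it is the candidate left out of the likely-negative set.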

Highlights

  • Parkinson’s disease (PD) was first described by Dr James Parkinson as a “shaking palsy” in 1817 [1]

  • The curve of the n-semble method lies above those of the selected methods, which indicates that the gain in F Score is robust, that is, independent of the samples drawn from the dataset

  • Specifying the conditions under which a classification method outperforms other classifiers is a key question in machine learning

Summary

Introduction

Parkinson’s disease (PD) was first described by Dr James Parkinson as a “shaking palsy” in 1817 [1]. It is the second most common neurodegenerative disease after Alzheimer’s and is most prevalent among the elderly. We have introduced a novel n-semble method to identify Parkinson’s disease genes. Geary autocorrelation (GA), Moran autocorrelation (MA) and normalized Moreau–Broto autocorrelation (NA) representation methods, based on the physicochemical properties of amino acids, are used to construct feature vectors from protein sequences, and a neural network-based ensemble model was put forth. The performance of the proposed n-semble method was analysed using precision, recall and F Score, and a comparative study was conducted to show the effectiveness of the proposed model.
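
The three autocorrelation descriptors named above can be sketched for a single amino acid property. This is a minimal illustration that follows the standard textbook definitions of the descriptors, not necessarily the exact configuration used in the paper. The Kyte-Doolittle hydropathy scale, the maximum lag of 3 and all function names are assumptions made for this example.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy values, used only as an illustrative property
# (assumption: the source does not list which properties were used).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Center and scale a property over the 20 amino acids."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (value - mu) / sigma for aa, value in scale.items()}


def moreau_broto(p, lag):
    """Normalized Moreau-Broto: mean of P_i * P_(i+lag)."""
    n = len(p)
    return sum(p[i] * p[i + lag] for i in range(n - lag)) / (n - lag)


def moran(p, lag):
    """Moran: lagged covariance divided by the variance of the sequence."""
    n = len(p)
    p_bar = mean(p)
    num = sum((p[i] - p_bar) * (p[i + lag] - p_bar) for i in range(n - lag)) / (n - lag)
    den = sum((x - p_bar) ** 2 for x in p) / n
    return num / den  # undefined (division by zero) for a constant sequence


def geary(p, lag):
    """Geary: mean squared lagged difference divided by the sample variance."""
    n = len(p)
    p_bar = mean(p)
    num = sum((p[i] - p[i + lag]) ** 2 for i in range(n - lag)) / (2 * (n - lag))
    den = sum((x - p_bar) ** 2 for x in p) / (n - 1)
    return num / den  # undefined (division by zero) for a constant sequence


def autocorrelation_features(sequence, max_lag=3, scale=HYDROPATHY):
    """Return [NA, MA, GA] for each lag 1..max_lag as one flat feature vector."""
    z = standardize(scale)
    p = [z[aa] for aa in sequence]  # the sequence must be longer than max_lag
    features = []
    for lag in range(1, max_lag + 1):
        features += [moreau_broto(p, lag), moran(p, lag), geary(p, lag)]
    return features


print(autocorrelation_features("IDIDID", max_lag=2))
```

For the strictly alternating toy sequence "IDIDID", neighbouring residues always differ and residues two positions apart are always identical, so the Moran value is negative at lag 1 and positive at lag 2, while the Geary value falls to zero at lag 2.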
