Abstract

Artificial Neural Network (ANN) algorithms have been widely used to analyse genomic data. Single Nucleotide Polymorphisms(SNPs) represent the genetic variations, the most common in the human genome, it has been shown that they are involved in many genetic diseases, and can be used to predict their development. Developing ANN to handle this type of data can be considered as a great success in the medical world. However, the high dimensionality of genomic data and the availability of a limited number of samples can make the learning task very complicated. In this work, we propose a New Neural Network classification method based on input perturbation. The idea is first to use SVD to reduce the dimensionality of the input data and to train a classification network, which prediction errors are then reduced by perturbing the SVD projection matrix. The proposed method has been evaluated on data from individuals with different ancestral origins, the experimental results have shown the effectiveness of the proposed method. Achieving up to 96.23% of classification accuracy, this approach surpasses previous Deep learning approaches evaluated on the same dataset.

Highlights

  • The human genome contains three billion of base pairs, with only 0.1% difference between individuals [1]

  • Single Nucleotide Polymorphisms(SNPs) represent the genetic variations, the most common in the human genome, it has been shown that they are involved in many genetic diseases, and can be used to predict their development

  • We propose a New Neural Network classification method based on input perturbation

Read more

Summary

Introduction

The human genome contains three billion of base pairs, with only 0.1% difference between individuals [1]. The most common type of genetic variations between individuals is called Single Nucleotide Polymorphism (SNP) [2]. An SNP is a change from one base pair to another, which occurs about once every 1000 bases. Most of these SNPs have no impact on human health. Many studies have shown that some of these genetic variations have important biological effects and are involved in many human diseases [3, 4]. SNPs are commonly used to detect genes associated with the development of a disease within families [5]. Soumare et al BioData Mining (2021) 14:30

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call