Abstract

Many lives have been lost to genetic diseases and the inability to identify them. Genetic disorders arise mainly from alterations in the common DNA nucleotide sequence; benign and pathogenic variants are common examples of such genetic variants. Deliberate gene mutations sometimes produce the required result, but unexpected outcomes are also highly likely. In this paper, two unsupervised deep learning methods are proposed to classify these genetic changes: a Self-Organizing Map (SOM) and an Autoencoder. The SOM is an unsupervised learning technique used to obtain a low-dimensional representation of the data, and it was implemented using the MiniSom library. The Autoencoder comprises an encoder and a decoder component: the information encoded by the encoder is decoded by the decoder to obtain a representation as close to the input as possible. The analysis was performed on the publicly available ClinVar dataset, comprising 6 lakh (600,000) records. The data was first pre-processed to handle missing and duplicate values. The Autoencoder achieved an accuracy of 97% on the test data, and the SOM achieved 96%. It is concluded that the unsupervised deep learning models, SOM and Autoencoder, retain enough predictive power to classify genetic variants and to identify whether the underlying alteration in the gene produces positive changes or the contrary.
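
As a rough illustration of the SOM idea described above, the following sketch trains a tiny map in plain Python. It is not the MiniSom implementation used in the paper and not the authors' code; the function names `train_som` and `best_matching_unit`, the grid size, the decay schedules, and the toy two-feature vectors are all assumptions chosen for illustration only.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the grid node whose weights are closest to x."""
    best, best_d2 = (0, 0), float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            d2 = sum((a - b) ** 2 for a, b in zip(x, w))
            if d2 < best_d2:
                best, best_d2 = (i, j), d2
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM and return its grid of weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised uniformly in [0, 1).
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3   # decaying neighbourhood width
            bi, bj = best_matching_unit(weights, x)
            for i in range(rows):
                for j in range(cols):
                    # Gaussian neighbourhood around the winning node.
                    grid_d2 = (i - bi) ** 2 + (j - bj) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[i][j]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Hypothetical, already-scaled feature vectors (NOT real ClinVar records).
toy_data = [
    [0.10, 0.15], [0.12, 0.08], [0.05, 0.11], [0.14, 0.12],
    [0.90, 0.85], [0.88, 0.93], [0.95, 0.89], [0.86, 0.91],
]

som = train_som(toy_data, rows=2, cols=2, epochs=50, seed=0)
# Each record is represented by the grid coordinates of its closest node.
bmus = [best_matching_unit(som, x) for x in toy_data]
```

In this sketch each record is reduced to the coordinates of its best-matching node, which is the low-dimensional representation the abstract refers to. Turning those coordinates into benign or pathogenic predictions would require an additional step, such as labelling each node by the majority class of the training records mapped to it; that step is an assumption here and is not described in the abstract.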

