Abstract
Non-synonymous single-nucleotide polymorphisms (nsSNPs) are a common type of genetic variant, and more than 6000 diseases have been found to be caused by nsSNPs. Accordingly, accurate prediction of disease-causing nsSNPs is of great importance for a better understanding of their functional mechanisms and for disease treatment. Many computational methods have been developed to distinguish disease-causing nsSNPs from neutral ones; however, there is still room for improvement in overall prediction performance. In this work, we propose a novel deep learning model called the multi-scale convolutional neural network (MSCNN). It uses multi-scale convolution with different kernel sizes for feature processing, which can capture more effective characteristics than a single convolution kernel size. Moreover, we apply three types of nominal structural features to further improve nsSNP prediction performance. Notably, the nsSNP sequence and structural features are extracted with the “residue environment” method we proposed, which proved effective for protein nsSNP prediction in our previous research. Based on the proposed MSCNN model and the extracted informative feature matrix, we implemented a new nsSNP predictor named DeepnsSNPs. DeepnsSNPs was tested on three nsSNP datasets collected from the PredictSNP1 website and achieved an average Matthews correlation coefficient of 0.507, which is 18.28% higher than the individual classifiers and 11.37% higher than the consensus classifier on average. Detailed dataset analyses demonstrate that DeepnsSNPs is useful for nsSNP prediction. We provide the Python source code and benchmark datasets at https://github.com/sera616/DeepnsSNPs.git for academic use.
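To illustrate the general idea of multi-scale convolution, the following is a minimal pure-Python sketch and not the DeepnsSNPs implementation. It applies several kernels of different sizes to the same one-dimensional feature track and stacks the results per position. The function names, the kernel sizes (1, 3, 5), and the fixed averaging weights are assumptions made for illustration; in a trained network the kernel weights would be learned, and the actual architecture is defined in the authors' repository.

```python
from typing import List, Sequence


def conv1d_same(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide an odd-length kernel over the signal (cross-correlation).

    The signal is zero-padded on both sides so the output has the same
    length as the input.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_features(
    signal: Sequence[float], kernels: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Run one convolution branch per kernel and stack the outputs.

    result[i] holds one value per kernel for position i, so each position
    is described at several receptive-field widths at once.
    """
    branches = [conv1d_same(signal, kernel) for kernel in kernels]
    return [list(column) for column in zip(*branches)]


# Hypothetical averaging kernels of widths 1, 3 and 5 (illustrative only).
kernels = [[1.0], [1 / 3] * 3, [1 / 5] * 5]
# Toy feature track with a single peak at the central residue position.
signal = [0.0, 0.0, 1.0, 0.0, 0.0]
features = multi_scale_features(signal, kernels)
```

Each row of `features` corresponds to one position and each column to one kernel width. Wider kernels spread the central peak over more neighbouring positions, which is the sense in which different kernel sizes capture context at different scales.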