Abstract

A large number of newly sequenced proteins are discovered on a daily basis, and the number of such sequences has grown exponentially over recent decades. The main concern is how to extract the useful characteristics of these sequences as input features for a neural network. Characterizing protein function through biological experiments is very expensive, yet finding associations within the available datasets is necessary to create and improve medical tools. Machine learning algorithms have recently attracted considerable attention and are now widely used; these algorithms are based on deep learning architectures and data-driven models. Previous work did not properly address two issues in the classification of biological sequences such as proteins: the efficient encoding of variable-length sequence data, and the implementation of deep learning based neural network models to enhance the performance of classification and recognition systems. To overcome these issues, we propose a deep learning based neural network architecture that improves classification performance. Specifically, we propose a 1D convolutional neural network that classifies protein sequences into the 10 most common classes. The model extracts features from the labeled protein sequences and learns from the dataset. We trained and evaluated our model on protein sequences downloaded from the Protein Data Bank (PDB). The model achieves an accuracy of up to 96%.
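
The abstract does not say how variable-length sequences are encoded or how the network is configured. As a rough illustration only, the following sketch shows one common way to turn a protein sequence into a fixed-length input and to apply a single 1D convolution filter followed by global max pooling. The alphabet ordering, the padding scheme, the helper names (`encode_sequence`, `one_hot`, `conv1d_valid`, `global_max_pool`), and the filter values are all assumptions made for this example, not the authors' implementation.

```python
# Hypothetical illustration: encoding scheme, max_len and filter are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# Index 0 is reserved for padding and unknown residues.
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_sequence(seq, max_len):
    """Map residues to integers, then truncate or right-pad with 0 to max_len."""
    codes = [AA_TO_INDEX.get(aa, 0) for aa in seq.upper()[:max_len]]
    return codes + [0] * (max_len - len(codes))


def one_hot(codes):
    """Turn integer codes into rows of width 21 (column 0 = padding/unknown)."""
    width = len(AMINO_ACIDS) + 1
    return [[1.0 if j == c else 0.0 for j in range(width)] for c in codes]


def conv1d_valid(x, kernel):
    """Slide one filter (kernel_size x channels) along the sequence, no padding."""
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        total = 0.0
        for offset in range(k):
            for xv, kv in zip(x[start + offset], kernel[offset]):
                total += xv * kv
        out.append(total)
    return out


def global_max_pool(feature_map):
    """Collapse a variable number of positions into a single feature."""
    return max(feature_map)


# Usage: a width-3 filter that responds when lysine (K) sits at its centre.
encoded = encode_sequence("MKTAY", max_len=8)
x = one_hot(encoded)
kernel = [[0.0] * 21 for _ in range(3)]
kernel[1][AA_TO_INDEX["K"]] = 1.0
feature = global_max_pool(conv1d_valid(x, kernel))
```

In a trained network the filter weights would be learned rather than set by hand, and many filters would feed a classifier over the 10 classes; the fixed `max_len` with zero padding is just one way to handle sequences of different lengths.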

