Abstract

Viruses and bacteria evolve constantly. Early identification of pathogens supports tasks ranging from limiting the spread of disease to drug design, which makes DNA sequence classification an essential part of computational biology. Pathogen identification was carried out by comparing sequenced genomes with NCBI data. Machine learning can classify DNA whose nature is unclear, even when the sequences are long and hard to identify. This study proposes a support vector machine (SVM) classification model. Because its initial accuracy was not optimal, optimization was needed. In contrast to previous studies, we applied grid search with cross-validation (grid search CV) to the SVM classification model. The technique recommended a polynomial kernel of degree 2 as the best parameter setting. Accuracy was 77% before optimization and 90% after. The study reports this as a 14% increase in accuracy (about 13 percentage points on the rounded figures) from applying grid search CV to DNA sequence classification with the SVM model.
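The tuning step described above can be sketched in Python with scikit-learn. This is a minimal illustration, not the study's pipeline: the synthetic sequences, the 3-mer count features, the train/test split, and the parameter grid are all assumptions made for the example, since the abstract does not state how the sequences were encoded or which values were searched.

```python
import random

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC


def random_sequence(rng, length, weights):
    """Draw a toy DNA string; `weights` biases the A/C/G/T frequencies."""
    return "".join(rng.choices("ACGT", weights=weights, k=length))


# Synthetic stand-in data (assumption): two classes with different base composition.
rng = random.Random(0)
sequences = [random_sequence(rng, 200, (4, 1, 1, 4)) for _ in range(60)]
sequences += [random_sequence(rng, 200, (1, 4, 4, 1)) for _ in range(60)]
labels = [0] * 60 + [1] * 60

# Assumed featurization: counts of overlapping 3-mers in each sequence.
vectorizer = CountVectorizer(analyzer="char", ngram_range=(3, 3), lowercase=False)
X = vectorizer.fit_transform(sequences)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0
)

# Baseline: SVC with scikit-learn defaults (RBF kernel, C=1.0).
baseline = SVC().fit(X_train, y_train)
baseline_accuracy = accuracy_score(y_test, baseline.predict(X_test))

# Hypothetical search space; the grid used in the study is not given.
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", "auto"]},
    {"kernel": ["poly"], "C": [0.1, 1, 10], "degree": [2, 3, 4]},
]

# 5-fold cross-validated grid search; the best model is refit on the training set.
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
tuned_accuracy = accuracy_score(y_test, search.predict(X_test))

print("best parameters:", search.best_params_)
print(f"baseline accuracy: {baseline_accuracy:.2f}")
print(f"tuned accuracy: {tuned_accuracy:.2f}")
```

On real data, the selected kernel and the size of any accuracy gain depend on the dataset and the grid, so this toy run should not be expected to reproduce the degree-2 polynomial kernel or the accuracies reported above.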
