Abstract

This paper presents an experimental investigation of the efficacy and the appropriate order of Frequency Chaos Game Representation (FCGR) for accurate in silico classification of pathogenic viruses. For this study, we curated genomic sequences of selected viral pathogens from the Virus Pathogen Database and Analysis Resource (ViPR) corpus. The viral genomes were encoded using first- to seventh-order FCGRs to produce training and testing genomic features. Thereafter, four different kernels of the naive Bayes classifier were trained and tested with the generated FCGR features. The highest average classification accuracy, 98%, was obtained with the third- and fourth-order FCGRs. However, weighing memory utilization and computational efficiency against classification accuracy, the third-order FCGR is deemed suitable for accurate classification of viral pathogens from genome sequences. This provides a promising foundation for developing a genomics-based diagnostic toolkit that could be used to promptly address the global incidence of epidemics caused by pathogenic viruses.
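
To make the encoding step concrete, the following is a minimal sketch of a k-th order FCGR, assuming the usual definition in which each k-mer of the genome is counted in one cell of a 2^k x 2^k grid. The function name `fcgr`, the corner layout in `CORNER`, the skipping of ambiguous bases, and the normalization to relative frequencies are illustrative assumptions and are not taken from the paper.

```python
# Assumed corner layout (one common CGR convention, not confirmed by the paper):
# each base contributes one (x, y) bit pair.
CORNER = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(sequence, k):
    """Return a 2^k x 2^k grid of relative k-mer frequencies (hypothetical helper)."""
    size = 2 ** k
    grid = [[0.0] * size for _ in range(size)]
    seq = sequence.upper()

    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: k-mers containing ambiguous symbols (e.g. N) are skipped.
        if any(base not in CORNER for base in kmer):
            continue
        x = y = 0
        # The last base of the k-mer selects the largest quadrant, mirroring
        # the chaos game, so bases are consumed from last to first.
        for base in reversed(kmer):
            bx, by = CORNER[base]
            x = (x << 1) | bx
            y = (y << 1) | by
        grid[y][x] += 1.0

    total = sum(sum(row) for row in grid)
    if total > 0:
        grid = [[count / total for count in row] for row in grid]
    return grid


def feature_vector(sequence, k):
    """Flatten the grid row by row into a vector of length 4^k."""
    return [value for row in fcgr(sequence, k) for value in row]
```

Under these assumptions the feature vector has 4^k entries, that is, 64 at third order, 256 at fourth order, and 16,384 at seventh order, which illustrates the memory and computation trade-off that favors the third-order FCGR when accuracy is comparable.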
