Abstract
Each HIV-1 patient carries a diverse population of virus strains because the virus replicates and mutates quickly, so combination drug therapy must be optimized to the patient's unique viral population. Towards this goal, prediction systems have been developed to deduce the susceptibility of a given HIV genotype to a single drug. Many are rule-based systems or rely on hand-crafted features, which are difficult to update for new HIV strains and new drugs. We adapted the vector-of-n-grams approach from document classification, together with chi-square feature selection, to automatically generate a feature set that yields performance comparable to expert-selected and database-derived feature sets without requiring treatment history data. Our automatically generated feature set also recovered all the expert-selected mutations and identified additional ones, demonstrating its potential for knowledge discovery. Compared with the previous state of the art, which relies on ample expert knowledge, our best fully automated prediction model for each drug yielded comparable performance, with 82.9% classification accuracy and a coefficient of determination of 0.819 on average. Because it requires no human expertise and shows potential for knowledge discovery, our automatic feature selection method is a good candidate for the more complex prediction task of combination drug therapy optimization.
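To make the general idea concrete, the following is a minimal sketch of n-gram extraction followed by chi-square feature ranking. It is not the implementation used in this work: the function names, the toy sequences, the binary resistant/susceptible labels, and the choice of position-independent presence/absence features are all assumptions made for illustration.

```python
def ngrams(seq, n):
    """Return the set of n-grams (length-n substrings) found in a sequence."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def chi_square(a, b, c, d):
    """Chi-square statistic for the 2x2 contingency table [[a, b], [c, d]].

    a: resistant and n-gram present    b: resistant and n-gram absent
    c: susceptible and n-gram present  d: susceptible and n-gram absent
    """
    total = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so the n-gram carries no signal
    return total * (a * d - b * c) ** 2 / denom


def select_features(sequences, labels, n=2, top_k=4):
    """Rank n-grams by chi-square association with a binary label (1 or 0)."""
    feature_sets = [ngrams(s, n) for s in sequences]
    vocab = sorted(set().union(*feature_sets))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    scores = {}
    for gram in vocab:
        a = sum(1 for f, y in zip(feature_sets, labels) if gram in f and y == 1)
        c = sum(1 for f, y in zip(feature_sets, labels) if gram in f and y == 0)
        scores[gram] = chi_square(a, n_pos - a, c, n_neg - c)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical toy data: 1 = resistant, 0 = susceptible.
    toy_sequences = ["MKVL", "MKVI", "MRVL", "MRVI"]
    toy_labels = [1, 1, 0, 0]
    for gram, score in select_features(toy_sequences, toy_labels):
        print(gram, score)
```

In this toy data, the 2-grams that appear only in one class receive the highest scores, while those spread evenly across both classes score zero. The selected n-grams could then serve as input features for a per-drug classifier or regressor.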