Abstract

The anatomical therapeutic chemical (ATC) classification system maintained by the World Health Organization provides a global standard for the classification of medical substances and serves as a source for drug repurposing research. Nevertheless, it lacks several drugs that are major players in the global drug market. In order to establish classifications for yet unclassified drugs, this paper presents a newly developed approach based on a combination of information extraction (IE) and machine learning (ML) techniques. Most of the information about drugs is published in the scientific articles. Therefore, an IE-based framework is employed to extract terms from free text that express drug's chemical, pharmacological, therapeutic, and systemic effects. The extracted terms are used as features within a ML framework to predict putative ATC class labels for unclassified drugs. The system was tested on a portion of ATC containing drugs with an indication on the cardiovascular system. The class prediction turned out to be successful with the best predictive accuracy of 89.47% validated by a 100-fold bootstrapping of the training set and an accuracy of 77.12% on an independent test set. The presented concept-based classification system outperformed state-of-the-art classification methods based on chemical structure properties.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.