Abstract

The goal of quantitative structure activity relationship (QSAR) learning is to learn a function that, given the structure of a small molecule (a potential drug), outputs the predicted activity of the compound. We employed multi-task learning (MTL) to exploit commonalities in drug targets and assays. We used datasets containing curated records, provided by ChEMBL, about the activity of specific compounds on drug targets. In total, 1091 assays were analysed. As a baseline, we considered a single-task learning approach that trains a random forest for each drug target individually to predict drug activity. We then carried out feature-based and instance-based MTL to predict drug activities. We introduced a natural measure of task relatedness: the evolutionary distance between drug targets. Instance-based MTL significantly outperformed both feature-based MTL and the base learner on 741 of the 1091 drug targets; feature-based MTL won on 179 occasions, and the base learner performed best on 171 drug targets. We conclude that MTL QSAR is improved by incorporating the evolutionary distance between targets. These results indicate that QSAR learning can be performed effectively, even when little data is available for a specific drug target, by leveraging what is known about similar drug targets.
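
To make the experimental setup concrete, the sketch below trains the single-task baseline: one random forest per drug target, evaluated with tenfold cross-validation. This is a minimal sketch, not the paper's code; the names `stl_baseline` and `target_datasets` (a mapping from a ChEMBL target identifier to its fingerprint matrix and activity values) are hypothetical, and scikit-learn is assumed.

    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    def stl_baseline(target_datasets):
        # Hypothetical input: {target_id: (X, y)} with a fingerprint matrix X
        # and a vector y of activity values for each ChEMBL drug target.
        scores = {}
        for target_id, (X, y) in target_datasets.items():
            if len(y) < 10:
                continue  # tenfold cross-validation needs at least 10 compounds
            model = RandomForestRegressor(n_estimators=100, random_state=0)
            scores[target_id] = cross_val_score(model, X, y, cv=10).mean()
        return scores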

Highlights

  • Introduction and problem specification: Caruana, in his widely cited paper, defined multi-task learning (MTL) as “an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias”

  • We only considered groups or classes that contain more than one drug target, since otherwise there would be no difference from single-task learning (STL), and we only included drug targets whose datasets contain at least 10 compounds, because we employ tenfold cross-validation

  • Conclusions and future work: We have shown that MTL can significantly improve the performance of quantitative structure activity relationship (QSAR) learning models and can help to better predict the activity of drugs against specific drug targets

Introduction

Caruana, in his widely cited paper, defined multi-task learning (MTL) (see the list of Abbreviations below) as “an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better” [1]. MTL approaches are commonly grouped into three categories: feature-based models, which learn a shared feature representation across tasks; parameter-based models, which encode task relatedness into the learning model via regularization of, or priors on, the model parameters; and instance-based models, which use data instances from all the tasks, suitably weighted, to construct a learner for each task.
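
A compact way to picture the instance-based category is instance weighting with an off-the-shelf learner: when fitting a model for one target, training examples borrowed from other targets are down-weighted by task relatedness, here a similarity derived from the evolutionary distance between targets. The sketch below illustrates this under those assumptions and is not the paper's exact procedure; `fit_instance_mtl` and `similarity` are hypothetical names, and it relies on scikit-learn's `sample_weight` support in random forests.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def fit_instance_mtl(target_id, target_datasets, similarity):
        # Pool compounds from every task; weight each borrowed instance by the
        # relatedness between its source target and the target being modelled.
        # `similarity(a, b)` is a hypothetical score in [0, 1], e.g. derived
        # from the evolutionary distance between drug targets a and b.
        X_parts, y_parts, w_parts = [], [], []
        for other_id, (X, y) in target_datasets.items():
            w = 1.0 if other_id == target_id else similarity(target_id, other_id)
            X_parts.append(X)
            y_parts.append(y)
            w_parts.append(np.full(len(y), w))
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(np.vstack(X_parts), np.concatenate(y_parts),
                  sample_weight=np.concatenate(w_parts))
        return model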
