Abstract

Motivation: In silico approaches often fail to utilize bioactivity data available for orthologous targets because there is insufficient evidence highlighting the benefit of such an approach. Deeper investigation into orthologue chemical space and its influence on expanding compound and target coverage is necessary to improve confidence in this practice.

Results: Here we present an analysis of the orthologue chemical space in ChEMBL and PubChem and its impact on target prediction. We show that the number of conflicting bioactivities between human and orthologues is low and that annotations are overall compatible. Chemical space analysis shows that orthologues are chemically dissimilar to human, with high intra-group similarity, suggesting they could effectively extend the chemical space modelled. Based on these observations, we show the benefit of orthologue inclusion in terms of novel target coverage. We also benchmarked predictive models using a time-series split and using bioactivities from Chemistry Connect and HTS data available at AstraZeneca, showing that including orthologue bioactivities gave a statistically significant improvement in performance.

Availability and implementation: Orthologue-based bioactivity prediction and the compound training set are available at www.github.com/lhm30/PIDGINv2.

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • In silico deconvolution is a well-established computational technique capable of inferring compound activity using similarity relationships between orphan compounds and identified ligands (Wang et al., 2013)

  • The Lyase target class is dominated by bovine data, which contributes 1 179 of the 1 314 bioactivities from orthologues. Of these, 1 101 are annotated for ‘Carbonic anhydrase 4’ (CA4), reflecting the popular method of purifying CA4 by extraction from bovine lung tissue (Scozzafava et al., 2012)

  • We present an in-depth analysis of orthologue bioactivity data and its relevance and applicability towards expanding compound and target bioactivity space for predictive studies
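The compatibility check summarized in the abstract, in which orthologue annotations are compared with human annotations for the same compound and target, could be sketched roughly as below. This is only an illustration under stated assumptions: the record layout, the species labels, the toy data and the function name `compare_annotations` are hypothetical and are not taken from the paper or from PIDGINv2.

```python
from collections import defaultdict

# Hypothetical toy records: (compound_id, human_target, source_species, is_active).
# Orthologue records are assumed to be already mapped onto the human target name.
records = [
    ("C1", "CA4", "human", True),
    ("C1", "CA4", "bovine", True),   # agrees with the human annotation
    ("C2", "CA4", "human", False),
    ("C2", "CA4", "bovine", True),   # conflicts with the human annotation
    ("C3", "CA4", "bovine", True),   # no human annotation: novel coverage
]


def compare_annotations(records):
    """Count orthologue compound-target pairs that agree with, conflict with,
    or are absent from the human annotations."""
    human = {}                      # (compound, target) -> activity label
    orthologue = defaultdict(set)   # (compound, target) -> set of labels seen
    for compound, target, species, active in records:
        key = (compound, target)
        if species == "human":
            # Simplification: a later human record overwrites an earlier one.
            human[key] = active
        else:
            orthologue[key].add(active)

    counts = {"agree": 0, "conflict": 0, "novel": 0}
    for key, labels in orthologue.items():
        if key not in human:
            counts["novel"] += 1
        elif labels == {human[key]}:
            counts["agree"] += 1
        else:
            counts["conflict"] += 1
    return counts


summary = compare_annotations(records)
print(summary)
```

In this toy setting, a low "conflict" count relative to "agree" would correspond to compatible annotations, and "novel" pairs would correspond to chemical space added by orthologues.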

Introduction

In silico deconvolution is a well-established computational technique capable of inferring compound activity using similarity relationships between orphan compounds and identified ligands (Wang et al., 2013). Previous approaches often focus on a single species, extracting bioactivity information for one organism from bioactivity repositories (Cereto-Massagué et al., 2015; Ivanov et al., 2016). In this situation, annotations for orthologues, the closest relatives of a given gene in different species, are disregarded. Since orthologues share functional similarity and are likely to share similar bioactivity profiles, the mapping between species
