Abstract
Deep learning is emerging as an important technology for various tasks in cheminformatics. In particular, graph convolutional neural networks (GCNs) have been reported to perform well in many types of prediction tasks related to molecules. Although GCNs show considerable potential in various applications, using them appropriately to obtain reasonable and reliable predictions requires a thorough understanding of both GCNs and programming. To make the power of GCNs available to a range of users, from chemists to cheminformaticians, an open-source GCN tool, kGCN, is introduced. To support users with different levels of programming skill, kGCN includes three interfaces: a graphical user interface (GUI) built on KNIME for users with limited programming skills, such as chemists, and command-line and Python library interfaces for users with advanced programming skills, such as cheminformaticians. To support the three steps required for building a prediction model, i.e., pre-processing, model tuning, and interpretation of results, kGCN includes functions for typical pre-processing, Bayesian optimization for automatic model tuning, and visualization of the atomic contributions to a prediction for interpretation of results. kGCN supports three types of approaches: single-task, multi-task, and multi-modal predictions. As a representative case study, kGCN is used to predict compound-protein interactions for four matrix metalloproteases, MMP-3, -9, -12, and -13, in inhibition assays. The visualization of atomic contributions is useful for validating the prediction models and for designing molecules based on them, realizing “explainable AI” for understanding the factors that affect an AI prediction. kGCN is available at https://github.com/clinfo.
Highlights
Deep learning is emerging as an important technology to perform various tasks in cheminformatics [1,2,3]. With the recent development of artificial intelligence (AI) and deep learning, the application of deep learning approaches has been practically demonstrated for various predictions such as virtual screening [4], quantitative structure-activity relationship (QSAR) studies [5], and ADMET prediction [6, 7].
The applicability of kGCN to compound-protein interaction (CPI) prediction is demonstrated as an example of single-task, multi-task, and multi-modal graph convolutional neural networks (GCNs).
The single-task GCN predicts the activity against a protein based on the chemical structure represented as a graph.
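To make the idea of predicting activity from a chemical structure represented as a graph more concrete, the following is a minimal sketch of a single graph convolution step followed by a sum-pooling readout. It is not kGCN's API or the authors' architecture: the function names, the one-hot atom features, and the fixed weight values are all hypothetical, and the layer uses the common symmetric-normalization form ReLU(Â H W) as an assumption. A real model would learn the weights from assay data.

```python
import math

def normalized_adjacency(n_atoms, bonds):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected molecular graph."""
    # Start from the identity so every atom also "sees" itself (self-loops).
    a = [[1.0 if i == j else 0.0 for j in range(n_atoms)] for i in range(n_atoms)]
    for i, j in bonds:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n_atoms)]
            for i in range(n_atoms)]

def matmul(x, y):
    """Plain matrix product for lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def graph_conv(a_hat, h, w):
    """One graph convolution layer: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

def predict_activity(a_hat, h, w, w_out, b_out):
    """Graph conv, sum-pool atoms into a molecule vector, then a sigmoid."""
    atom_embeddings = graph_conv(a_hat, h, w)
    molecule = [sum(col) for col in zip(*atom_embeddings)]
    logit = sum(m * wo for m, wo in zip(molecule, w_out)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))

# Ethanol heavy atoms: C(0)-C(1)-O(2); features are one-hot over [C, O].
bonds = [(0, 1), (1, 2)]
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
a_hat = normalized_adjacency(3, bonds)

# Hypothetical, untrained weights chosen only for illustration.
w = [[0.5, -0.2], [0.1, 0.8]]
w_out = [0.3, -0.4]
p_active = predict_activity(a_hat, features, w, w_out, 0.0)
print(f"illustrative activity score: {p_active:.3f}")
```

Because the readout sums over atoms, the score does not depend on the order in which the atoms are numbered, which is one reason graph-based models suit molecular inputs.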
Summary
Deep learning is emerging as an important technology to perform various tasks in cheminformatics [1,2,3]. With the recent development of artificial intelligence (AI) and deep learning, the application of deep learning approaches has been practically demonstrated for various predictions such as virtual screening [4], quantitative structure-activity relationship (QSAR) studies [5], and ADMET (absorption, distribution, metabolism, elimination, and toxicology) prediction [6, 7]. The accessibility of deep learning to non-experts is an important issue in the field of cheminformatics. Deep learning can be applied to a wide range of research areas in drug discovery, such as ADMET prediction for lead optimization and virtual screening for lead identification, so chemists should be able to address these research problems with the latest technologies, analyze the results, and benefit from deep learning. As chemists are typically not proficient in deep learning, the development of easy-to-use, multi-functional deep learning software is necessary.