Abstract
Bioactive compounds often serve as starting points for many therapeutic agents. In recent years, theoretical and practical advances in hardware-assisted, fast-evolving machine learning (ML) have made it possible to identify desired bioactive compounds in chemical spaces, such as those of natural products (NPs). This review introduces how ML approaches can be used to identify and evaluate bioactive compounds. It also gives an overview of recent research trends in ML-based prediction and evaluation of bioactive compounds, listing real-world examples along with the various types of input data. In addition, several ML-based approaches to identifying specific bioactive compounds for cardiovascular and metabolic diseases are described. Overall, these approaches are important for the discovery of novel bioactive compounds and provide new insights into how ML can support various traditional applications of bioactive compound-related research.
Highlights
Bioactive compounds are chemicals that exist in trace amounts in natural products (NPs) from plants and animals
Many approved drugs are structurally similar to bioactive compounds [8], providing a clue that finding bioactive compound analogs may be the right strategy for novel drug development
Although machine learning (ML)-based approaches typically achieve high performance for classification or regression tasks, some limitations still prevent them from being widely applied in NP-related research
Summary
Bioactive compounds are chemicals that exist in trace amounts in natural products (NPs) from plants and animals. Another obstacle to applying machine learning technology to gene expression data is that individual gene expression data have been processed in different analysis pipelines, so unknown bias (noise) that is difficult to identify, such as batch effects, different sequencing devices, and/or various experimental conditions, is inherent to the data. To minimize this bias and provide vast transcriptome resources in a user-friendly manner, a recent study constructed a transcriptome database called ARCHS4 (https://maayanlab.cloud/archs4/ (accessed on 10 December 2021)), comprising more than 200,000 human and mouse transcripts processed through a uniform analysis pipeline [41]. Advances in cheminformatics, bioinformatics, and a variety of publicly accessible databases have emerged rapidly over the past decade, accelerating the process of interconnecting chemistry, biology, and drug development. This system will be used to expand our limited understanding of chemicals and build promising ML-based models that generate novel bioactive-like compounds, with high efficacy and low toxicity, against various diseases, including cardiovascular and metabolic diseases.