Abstract

Transcriptomics data are relevant to address a number of challenges in Toxicogenomics (TGx). After careful planning of exposure conditions and data preprocessing, TGx data can be used in predictive toxicology, where more advanced modelling techniques are applied. The large volume of molecular profiles produced by omics-based technologies allows the development and application of artificial intelligence (AI) methods in TGx. Indeed, publicly available omics datasets are constantly increasing, together with a plethora of methods made available to facilitate their analysis and interpretation and the generation of accurate and stable predictive models. In this review, we present the state of the art of data modelling applied to transcriptomics data in TGx. We show how benchmark dose (BMD) analysis can be applied to TGx data. We review read-across and adverse outcome pathway (AOP) modelling methodologies. We discuss how network-based approaches can be successfully employed to clarify the mechanism of action (MOA) or specific biomarkers of exposure. We also describe the main AI methodologies applied to TGx data to create predictive classification and regression models, and we address current challenges. Finally, we present a short description of deep learning (DL) and data integration methodologies applied in these contexts. Modelling of TGx data represents a valuable tool for more accurate chemical safety assessment. This review is the third part of a three-article series on Transcriptomics in Toxicogenomics.

Highlights

  • Clarifying the toxic potential of diverse substances is an important challenge faced by scientists and regulatory authorities alike [1]

  • A number of methods are available for learning directed acyclic graphs (DAGs) [66], but significant effort is being made to adapt them so that they produce meaningful results on high-dimensional omics data [62,63,64,65,66,67]

  • A more refined point of departure (POD) concentration could be provided by coupling the aforementioned adverse outcome pathway (AOP)-based transcriptomics data analysis with benchmark dose (BMD) modelling to evaluate the dose-response nature of the exposure to engineered nanomaterials (ENMs) at the gene or pathway level
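The BMD idea behind the last highlight can be sketched numerically: fit a parametric dose-response model to gene-level fold changes and solve for the dose at which the response departs from the control level by a chosen benchmark response (BMR). The sketch below is purely illustrative (the data, the exponential model, and the grid-search fit are assumptions standing in for the maximum-likelihood fitting used by dedicated BMD tools), not the procedure prescribed by this review.

```python
import numpy as np

# Hypothetical dose-response data for a single gene (illustrative only):
doses = np.array([0.0, 0.1, 0.5, 1.0, 5.0, 10.0])       # exposure doses
response = np.array([1.0, 1.02, 1.08, 1.15, 1.9, 3.1])  # mean fold change

def exp_model(d, a, b):
    """Simple exponential dose-response model f(d) = a * exp(b * d)."""
    return a * np.exp(b * d)

# Least-squares fit via a coarse grid search (a stand-in for proper
# maximum-likelihood estimation with model averaging).
best = None
for a in np.linspace(0.5, 1.5, 101):
    for b in np.linspace(0.01, 0.5, 101):
        sse = np.sum((exp_model(doses, a, b) - response) ** 2)
        if best is None or sse < best[0]:
            best = (sse, a, b)
_, a_hat, b_hat = best

# BMD: the dose where f(d) exceeds the control response f(0) = a by the BMR.
# Solving a*exp(b*d) = a*(1 + BMR) gives d = ln(1 + BMR) / b.
bmr = 0.10  # 10% increase over control
bmd = np.log(1 + bmr) / b_hat
print(f"fitted a={a_hat:.3f}, b={b_hat:.3f}, BMD(10%)={bmd:.3f}")
```

In practice such fits are performed per gene (or aggregated per pathway) across several candidate models, and the most sensitive pathways' BMD values inform the POD.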


Summary

Introduction

Clarifying the toxic potential of diverse substances is an important challenge faced by scientists and regulatory authorities alike [1]. A wide range of algorithms has been proposed to build robust and accurate predictive models, including linear and logistic models, support vector machines (SVM), random forests (RF), classification and regression trees (CART), partial least squares discriminant analysis (PLSDA), linear discriminant analysis (LDA), artificial neural networks (ANNs), matrix factorization (MF) and k-nearest neighbours (K-NN) [23,24,25,26]. Classic techniques such as linear and logistic models were the first to be applied in such modelling tasks and can still be considered the methods of choice, especially when analysing small datasets. Finally, we provide a brief overview of data integration methodologies for multi-omics data analyses.
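To make the point about logistic models concrete, the sketch below trains a minimal logistic-regression classifier by batch gradient descent on synthetic transcriptomics-like features for a binary toxic/non-toxic label. Everything here (the synthetic data, feature count, learning rate, and iteration budget) is an illustrative assumption, not a model from the review.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 60, 5                       # 60 samples, 5 gene-expression features
X = rng.normal(size=(n, p))        # synthetic expression matrix
true_w = np.array([2.0, -1.5, 0.0, 0.0, 1.0])
# Noisy linear rule defines the (synthetic) toxic vs non-toxic labels:
y = (X @ true_w + rng.normal(scale=0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(p)
lr = 0.1
for _ in range(500):               # batch gradient descent on the log-loss
    grad = X.T @ (sigmoid(X @ w) - y) / n
    w -= lr * grad

accuracy = np.mean((sigmoid(X @ w) > 0.5) == y)
print(f"training accuracy: {accuracy:.2f}")
```

With only 60 samples and 5 features this simple model fits quickly and transparently, which is why linear and logistic models remain attractive for small TGx datasets; real transcriptomics data, with tens of thousands of genes, typically require the dimensionality reduction and feature selection steps discussed below.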

Benchmark Dose Modelling
Gene Co-Expression Network Analysis
Read-Across
Adverse Outcome Pathways
Machine Learning in Toxicogenomics
Dimensionality Reduction and Feature Selection
Stability and Applicability Domain
Clustering
Classification
Regression
Model Selection and Hyper-Parameter Optimization
Deep Learning
Data Integration for Multi-Omics Analyses
Findings
Conclusions