Abstract

Background: Batch effects were not accounted for in most studies of computational drug repositioning based on gene expression signatures, and it is unknown how batch effect removal methods affect the results of signature-based drug repositioning. Herein, we conducted differential analyses on the Connectivity Map (CMAP) database using several batch effect correction methods, both to evaluate their influence on computational drug repositioning from microarray data and to compare the methods with one another.

Results: The average signature size differed depending on the method applied. The gene signatures identified by the Latent Effect Adjustment after Primary Projection (LEAPP) method and by the methods fitted with the Linear Models for Microarray Data (limma) software showed little agreement. The external validity of the gene signatures was evaluated by connectivity mapping between the CMAP database and the Library of Integrated Network-based Cellular Signatures (LINCS) database. The connectivity mapping results indicate that the genes identified were not reliable for drugs with a total sample size (drug + control samples) smaller than 40, irrespective of the batch effect correction method applied. With a total sample size larger than 40, the methods correcting for batch effects produced significantly better results than the method with no batch effect correction. In a simulation study, power was generally low for simulated data with a sample size smaller than 40. We observed the best performance when using the limma method correcting for two principal components.

Conclusion: Batch effect correction methods strongly impact differential gene expression analysis, and thus downstream drug repositioning, when the sample size is large enough to contain sufficient information. We recommend including two or three principal components as covariates when fitting models with limma, provided the sample size is sufficient (more than 40 drug and control samples combined).
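The idea of adjusting for principal components as covariates can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' pipeline: it fits plain per-gene least squares rather than limma's moderated statistics, and the function names, the genes-by-samples layout, and the 0/1 treatment coding are all assumptions made for the example.

```python
import numpy as np


def top_principal_components(expr, n_components=2):
    """Return sample-level PC scores (samples x n_components).

    expr is assumed to be a genes x samples matrix of expression values.
    """
    # Center each gene so the PCs describe variation around the gene mean.
    centered = expr - expr.mean(axis=1, keepdims=True)
    # Rows of vt are orthonormal directions in sample space.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return (vt[:n_components] * s[:n_components, None]).T


def treatment_effects(expr, treated, n_components=2):
    """Per-gene treatment coefficient from an ordinary least squares fit.

    The design matrix holds an intercept, the 0/1 treatment indicator,
    and the leading principal components as surrogate batch covariates.
    """
    n_samples = expr.shape[1]
    pcs = top_principal_components(expr, n_components)
    design = np.column_stack([np.ones(n_samples), treated, pcs])
    # Solve all genes at once: one column of expr.T per gene.
    coef, *_ = np.linalg.lstsq(design, expr.T, rcond=None)
    return coef[1]  # row 1 corresponds to the treatment indicator
```

Setting `n_components=0` reduces the fit to a simple difference in group means, which gives a convenient uncorrected baseline to compare against.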

Highlights

  • Batch effects were not accounted for in most of the studies of computational drug repositioning based on gene expression signatures

  • We investigate how batch effect removal methods influence computational drug repositioning from microarray data, using the Connectivity Map (CMAP) dataset as an example because it is still the primary source of drug gene expression signatures

  • We evaluate the quality of the gene signatures generated by these methods by gene set enrichment analyses on the shared drugs between the CMAP database and the Library of Integrated Network-based Cellular Signatures (LINCS) database (Fig. 1a)

Summary

Introduction

Batch effects were not accounted for in most studies of computational drug repositioning based on gene expression signatures. A popular approach is to identify new indications for a drug when its gene signature shows a pattern of up-/down-regulation opposite to that of a disease signature [4]. This approach was piloted by the Connectivity Map (CMAP) project, in which a pattern matching algorithm was employed to rank the similarities between the query signature and the compound profiles, called reference signatures [5]. Sirota et al. integrated 164 drug compounds from CMAP and 100 diseases to predict novel therapeutic indications based on signatures in drug-disease pairs, which led to the discovery of cimetidine as a candidate treatment for lung adenocarcinoma [7]. As another example, Van Noort et al. utilized the gene expression profiles of more than 1000 drugs from CMAP and applied the inverse signature approach to identify anti-metastatic drugs for the treatment of colorectal cancer [11]. The follow-up database to CMAP is the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 database [12], which has recently been used in signature-based drug repositioning [13].
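The inverse signature idea can be illustrated with a deliberately simplified score. The sketch below is not the pattern matching algorithm used by CMAP; it only counts how many genes a drug moves in the opposite direction to a disease signature. The function names and the gene labels in the usage note are hypothetical.

```python
def reversal_score(disease_up, disease_down, drug_up, drug_down):
    """Score in [-1, 1]; positive values mean the drug opposes the disease.

    Each argument is a collection of gene identifiers. Genes that appear
    in only one of the two signatures do not contribute to the score.
    """
    disease_up, disease_down = set(disease_up), set(disease_down)
    drug_up, drug_down = set(drug_up), set(drug_down)
    # Genes the drug pushes against the disease direction.
    opposed = len(disease_up & drug_down) + len(disease_down & drug_up)
    # Genes the drug pushes in the same direction as the disease.
    concordant = len(disease_up & drug_up) + len(disease_down & drug_down)
    total = opposed + concordant
    return 0.0 if total == 0 else (opposed - concordant) / total


def rank_drugs(disease_up, disease_down, drug_signatures):
    """Rank drugs from most to least signature-reversing.

    drug_signatures maps a drug name to an (up_genes, down_genes) pair.
    """
    scores = {
        name: reversal_score(disease_up, disease_down, up, down)
        for name, (up, down) in drug_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, with a disease signature of up-regulated genes `{"G1", "G2"}` and down-regulated genes `{"G3", "G4"}`, a drug that up-regulates G3 and G4 while down-regulating G1 and G2 would rank first as a repositioning candidate.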
