Abstract

Background: Gene expression connectivity mapping has gained much popularity recently, with a number of successful applications in biomedical research testifying to its utility and promise. Previous methodological research in connectivity mapping focused mainly on two of the key components of the framework, namely the reference gene expression profiles and the connectivity mapping algorithms. The third key component, the query gene signature, has been left to users to construct without much consensus on how this should be done, although it is the issue most relevant to end users. As a key input to the connectivity mapping process, the gene signature is crucially important in returning biologically meaningful and relevant results. This paper intends to formulate a standardized procedure for constructing high quality gene signatures from a user’s perspective.

Results: We describe a two-stage process for making quality gene signatures using gene expression data as initial inputs. The first stage is a differential gene expression analysis comparing two distinct biological states; only the genes that pass stringent statistical criteria are considered in the second stage, which involves ranking genes based on statistical as well as biological significance. We introduce a “gene signature progression” method as a standard procedure in connectivity mapping. Starting from the highest ranked gene, we progressively determine the minimum length of the gene signature that allows connections to the reference profiles (drugs) to be established with a preset target false discovery rate. We use a lung cancer dataset and a breast cancer dataset as two case studies to demonstrate how this standardized procedure works, and we show that highly relevant and interesting biological connections are returned. Of particular note is gefitinib, identified as one of the candidate therapeutics in our lung cancer case study. Our gene signature was based on gene expression data from Taiwanese female non-smoker lung cancer patients, while there is evidence from independent studies that gefitinib is highly effective in treating female, non-smoker or former light smoker, advanced non-small cell lung cancer patients of Asian origin.

Conclusions: In summary, we introduced a gene signature progression method into connectivity mapping, which enables a standardized procedure for constructing high quality gene signatures. This progression method is particularly useful when the number of differentially expressed genes identified is large and they need to be prioritized for inclusion in the query signature. The results from two case studies demonstrate that the approach we have developed is capable of obtaining pertinent candidate drugs with high precision.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1066-x) contains supplementary material, which is available to authorized users.
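
The gene signature progression step described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the signed-sum connection score, the permutation p-value, the Benjamini-Hochberg step used to approximate a "preset target false discovery rate", and all function and parameter names (`connection_score`, `permutation_pvalue`, `benjamini_hochberg`, `progress_signature`, `n_perm`, `fdr`) are assumptions introduced here for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple

Signature = List[Tuple[str, int]]   # (gene, sign): +1 up-regulated, -1 down-regulated
Profile = Dict[str, float]          # gene -> signed reference score (assumed format)


def connection_score(signature: Signature, profile: Profile) -> float:
    """Assumed score: sum of sign * reference score; genes missing from the profile count as 0."""
    return sum(sign * profile.get(gene, 0.0) for gene, sign in signature)


def permutation_pvalue(signature: Signature, profile: Profile,
                       n_perm: int, rng: random.Random) -> float:
    """Two-sided p-value against random signatures of the same length.

    Assumes the profile contains at least as many genes as the signature.
    """
    observed = abs(connection_score(signature, profile))
    genes = sorted(profile)
    hits = 0
    for _ in range(n_perm):
        random_sig = [(g, rng.choice((-1, 1)))
                      for g in rng.sample(genes, len(signature))]
        if abs(connection_score(random_sig, profile)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues: Dict[str, float], fdr: float) -> List[str]:
    """Return the names that pass the Benjamini-Hochberg procedure at the given FDR."""
    ordered = sorted(pvalues.items(), key=lambda kv: kv[1])
    m = len(ordered)
    cutoff = 0
    for i, (_, p) in enumerate(ordered, start=1):
        if p <= fdr * i / m:
            cutoff = i
    return [name for name, _ in ordered[:cutoff]]


def progress_signature(ranked_genes: Signature, references: Dict[str, Profile],
                       fdr: float = 0.05, n_perm: int = 2000,
                       seed: int = 0) -> Tuple[Optional[int], List[str]]:
    """Grow the signature from the top-ranked gene until a connection is significant.

    Returns (minimum signature length, significant reference names),
    or (None, []) if no prefix of the ranked list yields a connection.
    """
    rng = random.Random(seed)
    for length in range(1, len(ranked_genes) + 1):
        signature = ranked_genes[:length]
        pvals = {name: permutation_pvalue(signature, profile, n_perm, rng)
                 for name, profile in references.items()}
        significant = benjamini_hochberg(pvals, fdr)
        if significant:
            return length, significant
    return None, []
```

In this sketch, `ranked_genes` would be the output of the second stage (genes ordered by the combined statistical and biological ranking), and `references` would map each drug to its reference expression profile. The loop stops at the shortest prefix of the ranked list that yields at least one significant connection, which mirrors the idea of finding the minimum signature length.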

Highlights

  • Gene expression connectivity mapping has gained much popularity recently, with a number of successful applications in biomedical research testifying to its utility and promise

  • Among the more than 1300 drugs profiled in the CMap reference dataset, the following are known histone deacetylase (HDAC) inhibitors (HDACi): vorinostat, trichostatin A, valproic acid, HC toxin, sodium phenylbutyrate, scriptaid, and MS-275

  • As can be seen from this table, the four input drugs (HC toxin, sodium phenylbutyrate, scriptaid, and MS-275) were identified as significantly connected to the gene signature, and the other major HDAC inhibitors (vorinostat, trichostatin A, and valproic acid) were also pulled out as significantly connected to the HDACi signature


Summary

Introduction

Over the past few years, gene expression connectivity mapping has gained much popularity among biomedical researchers because of its promising applications, as demonstrated by an increasing number of studies in different research areas: drug discovery [1,2,3,4,5], drug repositioning [6,7,8], predictive toxicology [9], and chemical carcinogenicity assessment [10], to name a few. From a user's point of view, one important question is often asked: what is the best way of making a gene expression signature that accurately represents the biological state of interest and returns meaningful biological connections?

