Detecting Statistically Significant Common Insertion Sites in Retroviral Insertional Mutagenesis Screens

Jeroen de Ridder, Lodewyk Wessels, Anthony Uren, Marcel Reinders, Jaap Kool

doi:10.1371/journal.pcbi.0020166

Abstract

Retroviral insertional mutagenesis screens, which identify genes involved in tumor development in mice, have yielded a substantial number of retroviral integration sites, and this number is expected to grow substantially due to the introduction of high-throughput screening techniques. The data of various retroviral insertional mutagenesis screens are compiled in the publicly available Retroviral Tagged Cancer Gene Database (RTCGD). Integrally analyzing these screens for the presence of common insertion sites (CISs, i.e., regions in the genome that have been hit by viral insertions in multiple independent tumors significantly more than expected by chance) requires an approach that corrects for the increased probability of finding false CISs as the amount of available data increases. Moreover, significance estimates of CISs should be established taking into account both the noise, arising from the random nature of the insertion process, and the bias, stemming from preferential insertion sites present in the genome and the data retrieval methodology. We introduce a framework, the kernel convolution (KC) framework, to find CISs in a noisy and biased environment using a predefined significance level while controlling the family-wise error (FWE) (the probability of detecting false CISs). Where previous methods use one, two, or three predetermined fixed scales, our method is capable of operating at any biologically relevant scale. This creates the possibility to analyze the CISs in a scale space by varying the width of the CISs, providing new insights into the behavior of CISs across multiple scales. Our method also features the possibility of including models for background bias. Using simulated data, we evaluate the KC framework using three kernel functions: the Gaussian, triangular, and rectangular kernels. We applied the Gaussian KC to the data from the combined set of screens in the RTCGD and found that 53% of the CISs do not reach the significance threshold in this combined setting. Still, with the FWE under control, application of our method resulted in the discovery of eight novel CISs, each of which has a probability of less than 5% of being a false detection.
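
The idea of convolving insertion positions with a kernel and thresholding the resulting peaks can be illustrated with a minimal sketch. It assumes a uniform null model without background bias, and it controls the FWE with a max-statistic permutation threshold, which is a common technique swapped in here for illustration and not necessarily the exact procedure of the paper. The function names (`gaussian_kc`, `fwe_threshold`, `find_cis`) and all parameter defaults are hypothetical.

```python
import numpy as np

def gaussian_kc(positions, grid, scale):
    """Sum of Gaussian kernels centred on each insertion, evaluated on the grid."""
    positions = np.asarray(positions, dtype=float)
    diffs = grid[:, None] - positions[None, :]
    return np.exp(-0.5 * (diffs / scale) ** 2).sum(axis=1)

def fwe_threshold(n_insertions, genome_length, grid, scale,
                  alpha=0.05, n_perm=200, seed=0):
    """Peak-height threshold from the null distribution of the maximum.

    Assumption: under the null, insertions are uniform over the genome.
    Taking the (1 - alpha) quantile of the per-permutation maximum bounds
    the probability of any false peak at roughly alpha.
    """
    rng = np.random.default_rng(seed)
    max_heights = np.empty(n_perm)
    for i in range(n_perm):
        null_positions = rng.uniform(0, genome_length, size=n_insertions)
        max_heights[i] = gaussian_kc(null_positions, grid, scale).max()
    return np.quantile(max_heights, 1 - alpha)

def find_cis(positions, genome_length, scale,
             alpha=0.05, n_grid=2000, n_perm=200, seed=0):
    """Return grid locations of local maxima exceeding the threshold."""
    grid = np.linspace(0, genome_length, n_grid)
    density = gaussian_kc(positions, grid, scale)
    threshold = fwe_threshold(len(positions), genome_length, grid, scale,
                              alpha, n_perm, seed)
    interior = density[1:-1]
    is_peak = ((interior > density[:-2]) & (interior >= density[2:])
               & (interior > threshold))
    return grid[1:-1][is_peak], threshold
```

Calling `find_cis(positions, genome_length, scale)` with a different `scale` re-estimates both the density and the threshold, so the significance level stays fixed while the width of the putative CISs varies.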

Highlights

Retroviral Tagging. In retroviral insertional mutagenesis experiments, genes involved in the development of cancer are identified by determining the loci of viral insertions from tumors induced by retroviruses in mice [1,2].
The common insertion sites (CISs) are displayed in scale space, which offers the opportunity to evaluate the lifespan of CISs across multiple scales.
Current methods do not control the number of falsely detected CISs without changing the scale of the putative CIS, and fail when applied to large datasets.
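
The notion of a CIS "lifespan" across scales can be sketched as follows: the same insertions are smoothed with Gaussian kernels of increasing width, and the number of local maxima is counted at each width. This is only an illustration under simplifying assumptions (no significance threshold, a regular evaluation grid); the function name `count_peaks_across_scales` and its parameters are hypothetical.

```python
import numpy as np

def count_peaks_across_scales(positions, grid, scales):
    """Map each kernel width to the number of local maxima in the smoothed density."""
    positions = np.asarray(positions, dtype=float)
    diffs = grid[:, None] - positions[None, :]
    counts = {}
    for scale in scales:
        # Gaussian kernel convolution of the insertion positions at this width
        density = np.exp(-0.5 * (diffs / scale) ** 2).sum(axis=1)
        interior = density[1:-1]
        is_peak = (interior > density[:-2]) & (interior >= density[2:])
        counts[scale] = int(is_peak.sum())
    return counts
```

Two nearby clusters that appear as separate peaks at a small scale merge into a single peak once the kernel width exceeds their separation, which is the kind of behavior a scale-space view makes visible.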

Summary

Introduction

Retroviral Tagging. In retroviral insertional mutagenesis experiments, genes involved in the development of cancer are identified by determining the loci of viral insertions from tumors induced by retroviruses in mice [1,2].

Based on the results depicted, we can conclude that for small scale parameters no CISs were discarded. This is in accordance with the background bias model used in the analysis: a Gaussian distribution of 65 kbp does not justify the removal of small CISs (see Figure S6).

Second (Figure 10B), the simulated background data are acquired by generating a realization of the insertions according to the density estimate from step A. Steps A and B are repeated to yield a distribution of insertion density estimates that follow the background for every location in the genome.
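
The resampling loop described above can be sketched roughly as follows. The excerpt does not say how the background density in step A is obtained, so the sketch takes it as a given input array; that input, the function names (`sample_from_density`, `background_null`), and the Gaussian smoothing of each realization are assumptions made for illustration.

```python
import numpy as np

def sample_from_density(density, grid, n, rng):
    """Step B: draw n insertion positions with probability proportional to density."""
    probs = density / density.sum()
    idx = rng.choice(len(grid), size=n, p=probs)
    step = grid[1] - grid[0]
    # Jitter within the chosen grid cell so positions are not locked to grid points
    return grid[idx] + rng.uniform(-0.5 * step, 0.5 * step, size=n)

def background_null(density, grid, n_insertions, scale, n_rep=100, seed=0):
    """Repeat sampling and smoothing to get per-location null density estimates.

    Returns an array of shape (n_rep, len(grid)); column j holds the
    distribution of density estimates at grid[j] under the background model.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty((n_rep, len(grid)))
    for r in range(n_rep):
        pos = sample_from_density(density, grid, n_insertions, rng)
        diffs = grid[:, None] - pos[None, :]
        estimates[r] = np.exp(-0.5 * (diffs / scale) ** 2).sum(axis=1)
    return estimates
```

A location-specific threshold could then be read from the columns of the returned array, for example with `np.quantile(estimates, 0.95, axis=0)`, so that regions with a high background density require a higher peak before they are called a CIS.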



Journal: PLoS Computational Biology
Publication Date: Jan 1, 2006
Citations: 136
License type: CC BY 4.0


