Abstract

DNA copy number alterations (CNAs) are the main genomic events that occur during the initiation and development of cancer. Distinguishing driver aberrant regions, which might contain candidate target genes for cancer therapies, from passenger regions is an important issue. Several methods for identifying cancer-driver genes across multiple cancer patients have been developed for single nucleotide polymorphism (SNP) arrays. However, these SNP array methods cannot be applied directly to next-generation sequencing (NGS) data, because NGS data have different characteristics, such as higher resolution without predefined probes and reads that are incorrectly mapped to the reference genome. In this study, we developed a wavelet-based method for the identification of focal genomic alterations in sequencing data (WIFA-Seq). We applied WIFA-Seq to whole-genome sequencing data from glioblastoma multiforme, ovarian serous cystadenocarcinoma and lung adenocarcinoma, and identified focal genomic alterations that contain candidate cancer-related genes as well as previously known cancer-driver genes.

Highlights

  • Detecting copy number alterations (CNAs) from NGS data is still an incomplete task

  • When we applied WIFA-Seq to whole-genome sequencing (WGS) data for glioblastoma multiforme (GBM), ovarian serous cystadenocarcinoma (OV) and lung adenocarcinoma (LUAD) obtained from TCGA[22–24], we found several well-known focal alterations as well as novel alterations

  • Because a subset of The Cancer Genome Atlas (TCGA) GBM samples has fractured regions with excessive read-depth changes[21], we examined whether WIFA-Seq controlled these excessive changes by comparing its results with BIC-seq[18] and TCGA SNP array data[22]


Summary

Introduction

Detecting CNAs from NGS data is still an incomplete task. It was recently reported[21] that, in a large fraction of the whole-genome sequencing (WGS) data for glioblastoma multiforme (GBM) samples from The Cancer Genome Atlas (TCGA)[22], genomes consist of fractured regions with excessive read-depth changes, and that these regions do not seem to be replicated in the matched SNP array data. Most conventional algorithms for detecting focal genomic alterations on array platforms control false discoveries by permuting probes or segmentation results. We previously developed WIFA[8], a focal copy number alteration detection algorithm for SNP arrays that is based on a wavelet transform. Here we developed a wavelet-based method for sequencing data, referred to as WIFA-Seq. Because NGS data have a higher resolution than SNP array data and lack predefined probes, testing statistical significance is challenging. In addition, some NGS data show excessive read-depth changes compared with the copy number changes in SNP array data. We addressed these issues in WIFA-Seq by improving WIFA. We compared CNA regions identified from WGS data using WIFA-Seq with those identified from SNP array data using GISTIC 2.0[14], and identified common and distinct regions.
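To illustrate why a wavelet transform suits focal events, the following is a minimal sketch and not the WIFA-Seq implementation. It assumes a Haar wavelet (the text above does not name the wavelet family) and a hypothetical vector of log2 read-depth ratios in 16 equal-width bins; the function names and the bin values are invented for illustration. A narrow gain leaves non-zero coefficients only at the fine scale around its boundaries, whereas a chromosome-wide shift would appear mainly in the coarse approximation.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal):
    """Full Haar decomposition; the length must be a power of two.

    Returns (final_approx, details), with details ordered fine to coarse.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx, details = list(signal), []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, rebuilding the signal from coarse to fine."""
    s = math.sqrt(2.0)
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.extend([(a + d) / s, (a - d) / s])
        current = rebuilt
    return current


# Hypothetical data: flat log2 ratio with a focal gain covering bins 5 and 6.
log_ratio = [0.0] * 16
log_ratio[5] = log_ratio[6] = 1.0

approx, details = haar_decompose(log_ratio)

# Positions of non-zero finest-scale coefficients mark the edges of the gain.
fine_hits = [i for i, d in enumerate(details[0]) if abs(d) > 1e-9]
print("finest-scale coefficients at pair indices:", fine_hits)
```

In this toy input the finest-scale coefficients are non-zero only for the bin pairs that straddle the edges of the gain, which is the property a wavelet-based detector can exploit. Assessing statistical significance, the harder problem raised above, is outside the scope of this sketch.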
