Abstract

Recurrent copy number alterations (CNAs) play an important role in cancer genesis. Although a number of computational methods have been proposed for identifying such CNAs, their relative merits remain largely unknown in practice because few efforts have focused on comparing them. To facilitate studies of recurrent CNA identification in cancer genomes, a comprehensive comparison of the performance and limitations of existing methods is needed. In this paper, six representative methods proposed in the last six years are compared. These include one-stage and two-stage approaches, working with raw intensity ratio data and discretized data respectively. They are based on various techniques such as kernel regression, correlation matrix diagonal segmentation, semi-parametric permutation and cyclic permutation schemes. We evaluate the methods under multiple simulation scenarios using several criteria: type I error rate, detection power, the receiver operating characteristic (ROC) curve and the area under the curve (AUC), and computational complexity. We also characterize their performance on two real datasets, from lung adenocarcinoma and glioblastoma. This comparison study reveals general characteristics of the existing methods for identifying recurrent CNAs and provides new insights into their strengths and weaknesses. We believe it will help accelerate the development of novel and improved methods.

Highlights

  • Identifying recurrent copy number alterations (CNAs) in cancer genomes is an important step in locating cancer driver genes and understanding the mechanisms of tumor initiation

  • To correct for multiple hypothesis testing, KC-SMART adopts the Bonferroni strategy, multiplying each assessed p-value by the total number of locations being tested

  • The input data to CMDS is largely similar to that of KC-SMART
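
The Bonferroni adjustment mentioned in the highlights can be illustrated with a minimal sketch. This is not KC-SMART's actual code: the function name `bonferroni_adjust`, the example p-values, the 0.05 threshold, and the capping of adjusted values at 1.0 are all assumptions made for illustration.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests performed.

    Capping at 1.0 is a common convention and an assumption here;
    the source only states that p-values are multiplied by the
    total number of locations tested.
    """
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical p-values for four genomic locations (illustrative only).
raw = [0.001, 0.02, 0.04, 0.5]
adjusted = bonferroni_adjust(raw)

# Flag locations that remain significant at an assumed 0.05 level.
significant = [p <= 0.05 for p in adjusted]
```

With many thousands of tested locations the multiplier becomes large, so the adjustment is conservative: only locations with very small raw p-values survive it.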


Summary

Introduction

Identifying recurrent copy number alterations (CNAs) in cancer genomes is an important step in locating cancer driver genes and understanding the mechanisms of tumor initiation. The most common reason for missing some well-known driver mutations is that almost all cancers are heterogeneous [6]: many recurrent CNAs appear only in a subset of samples (i.e., samples within subtypes), so their frequencies are less extreme across the full set of samples. To address this challenge, a number of statistical and computational methods with promising results have been reported. Many of them were reviewed and discussed by Rueda and Diaz-Uriarte in their recent paper [14].

