Abstract

Patient-derived xenograft (PDX) models of human tumors are an important and widely used platform for cancer research. Cancer drug development relies on PDX models to screen drugs and to characterize tumor biology for potential drug targets. It is well established that PDX models maintain biology similar to that of their original tumors, including histological patterning, gene expression, single-nucleotide variants, and copy number alterations. Profiling genomic alterations in PDX tumor models with short-read sequencing is becoming common practice in cancer research. Contaminating mouse reads are an important source of noise in PDX tumor sequencing data, so removing them effectively is a prerequisite for accurate and reproducible downstream analyses such as variant calling and gene expression quantification. Few studies have established best practices for handling PDX sequencing data. We therefore compared strategies for removing mouse contamination from PDX tumor DNA and RNA sequencing data, using a set of controlled in silico datasets and data from PDX tumors. The in silico experiments were built from publicly available human and mouse DNA and RNA sequencing data from the Sequence Read Archive (SRA) and were designed to assess a range of approaches for removing contaminating mouse reads from human data.
Subsets of the raw human and mouse reads were mixed at different ratios and analyzed with five approaches: 1) alignment of the unfiltered reads to the human reference genome, 2) filtering with the Xenome algorithm followed by alignment to the human genome, 3) alignment to the human reference genome followed by filtering with the XenofilteR algorithm, 4) alignment to a mouse-human hybrid reference genome, and 5) our novel NextCODE approach. For each procedure, we assessed the sensitivity and specificity with which it removed mouse sequence while retaining human sequence for downstream analyses. We also assessed the effect of each filtering procedure on gene expression quantification and variant calling. Our results introduce a novel, improved method for removing mouse DNA, yielding better-quality data for downstream analysis.

Citation Format: Ryan P. Abo, Zehua Chen, Shannon Bailey, Hao Wang, Sharvari Gujja, Pengwei Yang, Jim Lund, Jeff Gulcher, Tom Chittenden. Comprehensive assessment of mouse contamination removal strategies from patient-derived xenograft model sequencing data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 3586. doi:10.1158/1538-7445.AM2017-3586
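
As an illustration of the sensitivity and specificity assessment described in the abstract, the following is a minimal sketch of how a filtering procedure could be scored on an in silico mixture, where the true species of every read is known. It is not the authors' implementation. The names `FilterCounts`, `tally`, `sensitivity`, and `specificity` are hypothetical, and the sketch assumes that a retained human read counts as a true positive and a removed mouse read as a true negative; the abstract does not state which convention was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterCounts:
    """Outcome counts for one filtering procedure on a labeled read mixture."""
    human_kept: int     # human reads retained (true positives, by assumption)
    human_removed: int  # human reads wrongly discarded (false negatives)
    mouse_removed: int  # mouse reads correctly discarded (true negatives)
    mouse_kept: int     # mouse reads that passed the filter (false positives)


def tally(true_species, kept_ids):
    """Count outcomes given the known species of each read and the retained read IDs.

    true_species: dict mapping read ID -> "human" or "mouse"
    kept_ids: set of read IDs that survived filtering
    """
    human_kept = human_removed = mouse_removed = mouse_kept = 0
    for read_id, species in true_species.items():
        kept = read_id in kept_ids
        if species == "human":
            if kept:
                human_kept += 1
            else:
                human_removed += 1
        elif species == "mouse":
            if kept:
                mouse_kept += 1
            else:
                mouse_removed += 1
        else:
            raise ValueError(f"unexpected species label: {species!r}")
    return FilterCounts(human_kept, human_removed, mouse_removed, mouse_kept)


def sensitivity(counts):
    """Fraction of human reads that were retained."""
    total = counts.human_kept + counts.human_removed
    return counts.human_kept / total if total else float("nan")


def specificity(counts):
    """Fraction of mouse reads that were removed."""
    total = counts.mouse_removed + counts.mouse_kept
    return counts.mouse_removed / total if total else float("nan")


# Toy mixture: four human reads and two mouse reads (made-up IDs).
truth = {
    "r1": "human", "r2": "human", "r3": "human", "r4": "human",
    "r5": "mouse", "r6": "mouse",
}
kept = {"r1", "r2", "r3", "r5"}  # reads surviving a hypothetical filter
counts = tally(truth, kept)
```

Running the same tally for each approach and each human-to-mouse mixing ratio would give comparable sensitivity and specificity values across procedures.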
