Abstract

RNA sequencing (RNA-seq) is becoming a method of choice for studying transcriptomes, including the mapping of gene expression quantitative trait loci (eQTLs). RNA sample contamination or swapping is a serious problem for downstream analysis: it can lead to false discoveries and reduce the power to detect true biological relationships. When genetic data are available, for example in eQTL studies or when samples have previously been genotyped or DNA sequenced, they can be combined with RNA-seq data to detect sample contamination and resolve sample swaps. In this article, we introduce a tool (IDCheck) that allows easy assessment of concordance between genotype (from SNP arrays or DNA sequencing) and gene expression (RNA-seq) samples. IDCheck compares the identity of RNA-seq reads and SNP genotypes using a likelihood-based method. Based on maximum likelihood estimates of the relevant parameters, it can detect sample contamination and identify the correct sample pairs when swapping has occurred. The tool provides an efficient and convenient way to evaluate and resolve these problems. A complete description of the software is included on the application home page. The software is freely available in the public domain at http://eqtl.rc.fas.harvard.edu/idcheck/. Supplementary data are available at Bioinformatics online.
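
The abstract does not spell out the likelihood model, so the following is only a minimal illustrative sketch of how a likelihood-based concordance check of this kind might look, not IDCheck's actual implementation. It assumes a simple binomial mixture: at each SNP, a read shows the alternate allele with a probability that blends the candidate sample's genotype (0, 1 or 2 alternate alleles) with a population allele frequency, weighted by a contamination fraction `alpha`. All function names, the sequencing error rate `err`, and the grid search over `alpha` are hypothetical choices made for illustration.

```python
import math


def alt_read_prob(genotype, alpha, pop_freq, err=0.01):
    """Probability that one RNA-seq read shows the alternate allele.

    Assumed model: a fraction (1 - alpha) of reads comes from the genotyped
    individual and a fraction alpha from an unknown contaminant drawn from a
    population with alternate allele frequency pop_freq.
    """
    p = (1.0 - alpha) * genotype / 2.0 + alpha * pop_freq
    # Fold in a symmetric sequencing error so p stays strictly inside (0, 1).
    return p * (1.0 - err) + (1.0 - p) * err


def log_likelihood(genotypes, ref_counts, alt_counts, pop_freqs, alpha, err=0.01):
    """Binomial log-likelihood of the read counts (constant terms dropped)."""
    total = 0.0
    for g, r, a, f in zip(genotypes, ref_counts, alt_counts, pop_freqs):
        p = alt_read_prob(g, alpha, f, err)
        total += a * math.log(p) + r * math.log(1.0 - p)
    return total


def estimate_contamination(genotypes, ref_counts, alt_counts, pop_freqs,
                           err=0.01, steps=100):
    """Grid-search maximum likelihood estimate of alpha on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]

    def score(alpha):
        return log_likelihood(genotypes, ref_counts, alt_counts,
                              pop_freqs, alpha, err)

    best_alpha = max(grid, key=score)
    return best_alpha, score(best_alpha)


def best_matching_sample(candidates, ref_counts, alt_counts, pop_freqs, err=0.01):
    """Return the candidate genotype sample with the highest maximized likelihood.

    candidates maps a sample name to its list of genotypes (0, 1 or 2).
    """
    scores = {
        name: estimate_contamination(g, ref_counts, alt_counts, pop_freqs, err)[1]
        for name, g in candidates.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy data: six SNPs, RNA-seq reads consistent with sample "A".
    candidates = {"A": [0, 1, 2, 0, 2, 1], "B": [2, 1, 0, 2, 0, 1]}
    ref = [20, 10, 0, 20, 0, 10]
    alt = [0, 10, 20, 0, 20, 10]
    freqs = [0.5] * 6
    name, _ = best_matching_sample(candidates, ref, alt, freqs)
    alpha, _ = estimate_contamination(candidates[name], ref, alt, freqs)
    print(name, alpha)
```

In this sketch, a large estimated `alpha` for the expected genotype sample would suggest contamination, while a different genotype sample achieving a much higher likelihood would suggest a swap. A real tool would need to handle complications that this toy model ignores, such as allele-specific expression, uneven coverage, and genotyping error.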
