Abstract

Rapid advances in next-generation sequencing (NGS) technologies provide many opportunities to identify associations between genetic sequence variants (GSVs) and diseases, which may lead to better clinical diagnosis and treatment. The goal of this project is to efficiently identify candidate GSVs that may be associated with cancer, using prostate cancer data as an example. For this purpose, we analyzed the NGS data of 503 patients with prostate cancer (PrCa), available as variant call format (VCF) files from The Cancer Genome Atlas project and released through the National Cancer Institute Genomic Data Commons. We proposed a systematic workflow that first splits the NGS data by chromosome for efficient preprocessing, then compares the statistical properties of GSVs between cancer and normal samples, and finally identifies the GSVs with statistically significant associations with PrCa. Two popular tools, FATHMM and PROVEAN, were also employed to predict the functional effects of GSVs. These results provided a list of 104 high-scoring nonsynonymous GSVs that deserve further investigation for possible pathogenicity. One particular GSV on this list, located within the SPOP gene on chromosome 17, is also significantly associated with PrCa. The methods presented in this paper will be implemented in the OncoMiner pipeline, enabling their application to various types of cancer.
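
As a rough illustration of the first and last steps of this workflow, the sketch below groups VCF records by chromosome and then tests one variant for association using a 2x2 table of carrier counts. The abstract does not name the statistical test that was used, so Fisher's exact test here is an assumption, and the function names `split_by_chromosome` and `fisher_exact_two_sided` are hypothetical and not taken from the OncoMiner pipeline.

```python
from collections import defaultdict
from math import comb


def split_by_chromosome(vcf_lines):
    """Group VCF data lines by their CHROM column (first tab-separated field).

    Header lines (starting with '#') are returned separately so they can be
    written at the top of each per-chromosome file.
    """
    header, by_chrom = [], defaultdict(list)
    for line in vcf_lines:
        if line.startswith("#"):
            header.append(line)
        elif line.strip():
            chrom = line.split("\t", 1)[0]
            by_chrom[chrom].append(line)
    return header, dict(by_chrom)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Assumed layout (illustrative only):
      a = cancer samples carrying the variant, b = cancer samples without it,
      c = normal samples carrying the variant, d = normal samples without it.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x carriers among cancer samples.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

In practice, each per-chromosome group could be processed independently, and the p-values across all tested GSVs would need a multiple-testing correction before any variant is called significant.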
