Abstract

The complicated, evolving landscape of cancer mutations poses a formidable challenge to identify cancer genes among the large lists of mutations typically generated in NGS experiments. The ability to prioritize these variants is therefore of paramount importance. To address this issue we developed OncoScore, a text-mining tool that ranks genes according to their association with cancer, based on available biomedical literature. Receiver operating characteristic curve and the area under the curve (AUC) metrics on manually curated datasets confirmed the excellent discriminating capability of OncoScore (OncoScore cut-off threshold = 21.09; AUC = 90.3%, 95% CI: 88.1–92.5%), indicating that OncoScore provides useful results in cases where an efficient prioritization of cancer-associated genes is needed.

Highlights

  • The complicated, evolving landscape of cancer mutations poses a formidable challenge to identify cancer genes among the large lists of mutations typically generated in NGS experiments

  • We analyzed the performance of OncoScore on the Cancer Genes Census (CGC; Supplementary Table 1), a collection of regularly updated and manually annotated genes accepted as causally implicated in oncogenesis[1]

  • To assess the ability of OncoScore to discriminate between cancer and non-cancer genes we generated the OncoScore estimation for the whole CGC dataset and for a manually curated list of genes not associated with cancer

Read more

Summary

Introduction

The complicated, evolving landscape of cancer mutations poses a formidable challenge to identify cancer genes among the large lists of mutations typically generated in NGS experiments. Despite the development of a significant number of tools devoted to cancer driver prediction, limited effort has been dedicated to tools able to generate a gene-centered Oncogenic Score based on the evidence already available in the scientific literature To overcome these limitations, we propose here OncoScore, a bioinformatics text-mining tool capable of automatically scanning the biomedical literature by means of dynamically updatable web queries and measuring gene-specific cancer association in terms of gene citations. We propose here OncoScore, a bioinformatics text-mining tool capable of automatically scanning the biomedical literature by means of dynamically updatable web queries and measuring gene-specific cancer association in terms of gene citations The output of this analysis is a score representing the strength of the association of any gene symbol to cancer, based on the literature available at the time of the analysis. OncoScore is distributed as a R Bioconductor package (https://bioconductor.org/packages/release/bioc/html/OncoScore.html) in order to allow full customization of the algorithm and easy integration in existing NGS pipelines, and as a web tool for easy access by researchers with limited or no experience in bioinformatics (http://www.galseq.com/oncoscore.html)

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.