Abstract

Large-scale sequencing projects, such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), have generated high-throughput sequencing and molecular profiling data sets, but automatically identifying potentially causal changes in cellular processes, in cancer as well as in other diseases, remains challenging. We developed netboxr, a package written in the R programming language that uses the NetBox algorithm to identify candidate cancer-related functional modules. The algorithm takes a data-driven, network-based approach that combines prior knowledge with a network clustering algorithm, removing the need for, and the limitations of, independently curated, functionally labeled gene sets. The method can combine multiple data types, such as mutations and copy number alterations, leading to more reliable identification of functional modules. We make the tool available in the Bioconductor R ecosystem for applications in cancer research and cell biology. The netboxr package is free, open-source software released under the GNU GPL-3 license and is available at https://www.bioconductor.org/packages/release/bioc/html/netboxr.html.

Highlights

  • Large-scale sequencing consortia such as The Cancer Genome Atlas (TCGA) [1] and the International Cancer Genome Consortium (ICGC) [2] provide detailed genomic alteration profiling in many cancer types

  • Many methods based on the recurrence of genomic alterations, i.e., the frequency of occurrence in sets of tumor samples, have been developed to identify alterations likely to be functional in oncogenesis or cancer progression, addressing an important question in the field of precision oncology [3]

  • We have developed the NetBox algorithm that seeks to automate the identification of candidate oncogenic processes and involved genes, which allows the quantitative analysis of genomic alterations in the context of known signaling pathway connectivity [4]
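
To make the idea of analyzing alterations "in the context of known signaling pathway connectivity" concrete, the sketch below runs a toy example in Python using only the standard library. It is not the netboxr API and not the authors' implementation. It assumes that a non-altered "linker" gene is kept when a hypergeometric test shows it touches more altered genes than its degree would predict, and it uses plain connected components as a stand-in for the network clustering step. The gene names, the edge list, and the helper functions `build_network`, `hypergeom_tail`, `find_linkers`, and `modules` are all hypothetical.

```python
from math import comb


def build_network(edges):
    """Build an undirected adjacency map {gene: set(neighbors)} from an edge list."""
    network = {}
    for a, b in edges:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K are altered."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def find_linkers(network, altered, p_cutoff=0.05):
    """Return {gene: p} for non-altered genes that touch >= 2 altered genes
    more often than expected by chance (assumed test, no multiple-testing
    correction in this sketch)."""
    n_other = len(network) - 1  # every gene except the candidate itself
    linkers = {}
    for gene, neighbors in network.items():
        if gene in altered:
            continue
        n_hits = len(neighbors & altered)
        if n_hits < 2:
            continue
        p = hypergeom_tail(n_hits, n_other, len(altered), len(neighbors))
        if p <= p_cutoff:
            linkers[gene] = p
    return linkers


def modules(network, kept):
    """Connected components of the subnetwork induced by `kept`
    (a simplification standing in for a clustering algorithm)."""
    seen, result = set(), []
    for start in sorted(kept):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend((network[node] & kept) - component)
        seen |= component
        result.append(component)
    return result


# Toy data: L touches only the three altered genes; H is a hub that
# touches two altered genes plus six unrelated ones.
edges = [("L", "A"), ("L", "B"), ("L", "C"), ("H", "A"), ("H", "B")]
edges += [("H", f"X{i}") for i in range(1, 7)]
network = build_network(edges)
altered = {"A", "B", "C"}

linkers = find_linkers(network, altered)
found = modules(network, altered | set(linkers))
print(linkers)
print(found)
```

In this toy network the low-degree gene L passes the test while the hub H does not, even though both touch several altered genes. This shows why degree matters when choosing linkers. A real analysis would also correct for multiple testing and replace the connected-components step with a proper community detection method.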
