Abstract

Background: Pathways with members that have known relevance to a disease are used to support hypotheses generated from analyses of gene expression and proteomic studies. Using cancer as an example, the pitfalls of searching pathways databases as support for genes and proteins that could represent false discoveries are explored.

Findings: The frequency with which networks could be generated from 100 instances each of randomly selected five-gene and ten-gene sets as input to MetaCore, a commercial pathways database, was measured. A PubMed search enumerated cancer-related literature published for any gene in the networks. Allowing a maximum of three, two, and one intervening steps between input genes to populate the network, networks were generated with frequencies of 97%, 77%, and 7% for ten-gene sets and 73%, 27%, and 1% for five-gene sets. PubMed reported an average of 4225 cancer-related articles per network gene.

Discussion: These high frequencies can be attributed to the richly populated pathways databases and the interest in the molecular basis of cancer. As information sources become enriched, they are more likely to generate plausible mechanisms for false discoveries.

Highlights

  • Modern research into the molecular biology of cancer can be traced back to the introduction of Knudson’s “two-hit” hypothesis based on his discovery of a second somatic mutation in tumors from patients with a germline retinoblastoma gene mutation [1]

  • As our knowledge about pathways increases, more genes are assigned to networks and the probability of generating a network from a randomly drawn set of genes is constantly increasing

  • At the same time, the number of publications relating genes to cancer is growing, so the probability of finding a cancer paper that mentions a gene listed as part of a “discovered” pathway increases over time


Summary

Introduction

Modern research into the molecular biology of cancer can be traced back to the introduction of Knudson’s “two-hit” hypothesis, based on his discovery of a second somatic mutation in tumors from patients with a germline retinoblastoma gene mutation [1]. Twenty years after the “two-hit” hypothesis was proposed, a study of colorectal cancers demonstrated a more complex scenario, with most tumor samples showing mutations in four to five genes [2]. A more recent study found that up to 20 mutated genes could have a role in the evolution of a type of cancer [3]. Many pathways can be assigned a role in even a single type of cancer, and the number of genes and/or proteins is larger still when entire networks or pathways are considered. Using cancer as an example, the pitfalls of searching pathways databases as support for genes and proteins that could represent false discoveries are explored.
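The random-draw experiment described in the abstract can be illustrated with a small simulation. The sketch below is not the MetaCore procedure, whose network-building algorithm is not described here. It assumes a toy undirected interaction graph (`toy_database`, hypothetical) and assumes that a network counts as "generated" when at least two input genes are linked by a path with no more than a given number of intervening genes. All function names and parameters are illustrative.

```python
import random
from collections import deque
from itertools import combinations


def within_steps(graph, source, target, max_intervening):
    """True if target is reachable from source with at most
    max_intervening nodes lying between them on the path."""
    max_edges = max_intervening + 1
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return True
        if dist == max_edges:
            continue  # do not expand beyond the allowed path length
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return False


def network_generated(graph, gene_set, max_intervening):
    """Assumed criterion: any pair of input genes is linked closely enough."""
    return any(
        within_steps(graph, a, b, max_intervening)
        for a, b in combinations(gene_set, 2)
    )


def network_frequency(graph, genes, set_size, max_intervening, trials=100, seed=0):
    """Fraction of randomly drawn gene sets that yield a network."""
    rng = random.Random(seed)
    hits = sum(
        network_generated(graph, rng.sample(genes, set_size), max_intervening)
        for _ in range(trials)
    )
    return hits / trials


def toy_database(n_genes, n_edges, seed=0):
    """Hypothetical stand-in for a pathways database: a random undirected graph."""
    rng = random.Random(seed)
    genes = [f"G{i}" for i in range(n_genes)]
    graph = {g: set() for g in genes}
    for _ in range(n_edges):
        a, b = rng.sample(genes, 2)
        graph[a].add(b)
        graph[b].add(a)
    return genes, graph


if __name__ == "__main__":
    genes, graph = toy_database(n_genes=2000, n_edges=4000)
    for set_size in (5, 10):
        for steps in (1, 2, 3):
            freq = network_frequency(graph, genes, set_size, steps)
            print(f"set size {set_size}, max {steps} intervening: {freq:.0%}")
```

The frequencies printed for the toy graph are not comparable to the reported MetaCore figures. The sketch only shows the qualitative behavior: for a fixed set of draws, allowing more intervening steps or adding more edges to the database can only raise the fraction of random gene sets that produce a network.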

