Ten Thousand Views of Bioinformatics: A Bibliome Perspective

I Kohane

doi:10.1055/s-0038-1638649

Abstract

The objective was to summarize the current state of bioinformatics research from the literature published in 2008. The entire corpus of publications indexed by the National Library of Medicine in the PubMed repository was reviewed for articles tagged as belonging to the discipline of bioinformatics, either by Medical Subject Heading or by a term in the title or abstract of the article. Selected summary statistics of this corpus were then used to motivate additional exploration. Over ten thousand articles published in 2008 populated the bioinformatics corpus. Notably, at least as many publications in genomics and genetics used computational techniques but were not identified as bioinformatics research. Genomics and proteomics continued to be the leading application domains of bioinformatics research, but despite the proliferation of human studies, the genes most studied in the corpus were from yeast rather than from humans. The growth in genomic studies of human disease was accompanied by a growing critical literature regarding the methods, results and impact of these studies. Concurrently, the availability of full genome sequences at commodity prices has increased the computational challenges of human studies by several orders of magnitude. Further concerns were raised about the consequences of public disclosure of comprehensive or even aggregate genomic data. The impressive size of the bioinformatics bibliome is easily dwarfed by the challenges generated by the continued growth of high-throughput biological data sets. The demand for bioinformatics expertise and tools is therefore likely to continue to increase, at least in the near term.
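The corpus selection described in the abstract (a Medical Subject Heading match or a term match in the title or abstract, restricted to one publication year) could be approximated with a PubMed search string. The sketch below is illustrative only and is not the author's actual query: the choice of "Computational Biology" as the MeSH heading, the free-text term "bioinformatics", and the helper names `build_query` and `build_count_url` are all assumptions. It builds a request URL for the NCBI E-utilities `esearch` endpoint and performs no network access.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(year, mesh_heading="Computational Biology",
                text_term="bioinformatics"):
    """Return a PubMed query string for one publication year.

    Assumption: the MeSH heading and the free-text term are illustrative
    defaults, not the terms used in the original study.
    """
    # Match either the MeSH heading or the term in the title/abstract.
    tagged = f'"{mesh_heading}"[MeSH Terms] OR {text_term}[Title/Abstract]'
    # [dp] restricts the match to the given date of publication.
    return f"({tagged}) AND {year}[dp]"


def build_count_url(year):
    """Return an esearch URL that asks only for the number of matches."""
    params = {"db": "pubmed", "term": build_query(year), "rettype": "count"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query(2008))
    print(build_count_url(2008))
```

Fetching the printed URL would return the current match count, which may differ from the 2008 figures reported here because PubMed indexing changes over time.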

Full Text