Abstract

Getting Started in Text Mining: Part Two

Highlights

  • This article is intended to continue where Cohen and Hunter [1] left off in “Getting Started in Text Mining,” an introduction in the January 2008 issue of PLoS Computational Biology, which covered the actual mining of text and its digestion into small quanta of computer-manageable information

  • We focus on the downstream questions scientists can ask using text-mining and literature-mining engines

  • We begin at the top left of the figure, which shows the process of information retrieval—how we select relevant documents [2]

Introduction

This article is intended to continue where Cohen and Hunter [1] left off in “Getting Started in Text Mining,” an introduction in the January 2008 issue of PLoS Computational Biology, which covered the actual mining of text and its digestion into small quanta of computer-manageable information (http://www.ploscompbiol.org/doi/pcbi.0040020). In this overview of the field, we begin by summarizing the major stages of current text-processing pipelines. Named-entity recognition is closely related to the design of controlled terminologies [6] and ontologies for the annotation of texts and experimental data [7], a process that often requires a monumental community effort [8].
