Abstract
Getting Started in Text Mining: Part Two
Highlights
This article is intended to continue where Cohen and Hunter [1] left off in "Getting Started in Text Mining," an introduction in the January 2008 issue of PLoS Computational Biology, which covered the actual mining of text and its digestion into small quanta of computer-manageable information.
We focus on the downstream questions scientists can ask using text-mining and literature-mining engines.
We begin at the top left of the figure, which shows the process of information retrieval: how we select relevant documents [2].
Summary
This article is intended to continue where Cohen and Hunter [1] left off in "Getting Started in Text Mining," an introduction in the January 2008 issue of PLoS Computational Biology, which covered the actual mining of text and its digestion into small quanta of computer-manageable information (http://www.ploscompbiol.org/doi/pcbi.0040020). In this overview of the field, we begin by summarizing the major stages of current text-processing pipelines. Named-entity recognition is closely related to the design of controlled terminologies [6] and ontologies for the annotation of texts and experimental data [7], a process often requiring a monumental community effort [8].