Getting started in text mining.

K Bretonnel Cohen, Lawrence Hunter

doi:10.1371/journal.pcbi.0040020

Highlights

Text mining is the use of automated methods for exploiting the enormous amount of knowledge available in the biomedical literature.
Text mining specialists are more likely to build systems that will get them published at computational linguistics conferences.
Biologists seem to be better at one of the crucial first steps identified above: defining the goals of the system, and not hesitating to define those goals based on utility rather than on presumed publishability in the computational linguistics literature.

Introduction

Text mining is the use of automated methods for exploiting the enormous amount of knowledge available in the biomedical literature.

Breast cancer could be referred to as breast cancer, carcinoma of the breast, or mammary neoplasm. These variability issues challenge more sophisticated systems as well; we discuss ways of coping with them in Text S1.

(See [3] for an early rule-based system, and [4] for a discussion of rule-based approaches to various biomedical text mining tasks.) At one end of the spectrum, a simple rule-based system might use hardcoded patterns, for example patterns built around a <gene> placeholder.

The former is a cadherin, and is associated with tumor suppression and with bipolar disorder, while the latter is a thrombospondin receptor associated with atherosclerosis, platelet glycoprotein deficiency, hyperlipidemia, and insulin resistance, to name just a few phenotypes. These ambiguities are not trivial: if your analysis is wrong, you miss or erroneously extract information on relations between molecular biology and human disease.

A third approach, post-hoc judging of system outputs, will often suffice for publication, but is often not practical for system development since it cannot be repeated quickly and frequently.
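The two ideas above, hardcoded patterns with a gene slot and the variability of disease names, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the paper: the `<gene> interacts with <gene>` template, the three-entry gene lexicon, the synonym table, and the function names. A real system would draw its lexicons from curated resources rather than hand-written dictionaries.

```python
import re

# Hypothetical synonym table: surface variants mapped to one canonical concept.
SYNONYMS = {
    "breast cancer": "breast cancer",
    "carcinoma of the breast": "breast cancer",
    "mammary neoplasm": "breast cancer",
}

# Hypothetical toy gene lexicon used to fill the <gene> slot.
GENES = ["BRCA1", "TP53", "RAD51"]


def build_pattern(template, genes):
    """Compile a template such as '<gene> interacts with <gene>' into a regex.

    Each <gene> slot becomes a capturing group that matches any lexicon entry.
    """
    # Longest names first, so a longer name wins over a shorter prefix.
    ordered = sorted(genes, key=len, reverse=True)
    gene_group = r"\b(" + "|".join(re.escape(g) for g in ordered) + r")\b"
    # Escape the literal text between slots, then splice the gene group in.
    literal_parts = [re.escape(part) for part in template.split("<gene>")]
    return re.compile(gene_group.join(literal_parts))


def extract_interactions(text, pattern):
    """Return (gene, gene) pairs for every match of the hardcoded pattern."""
    return pattern.findall(text)


def normalize_mentions(text):
    """Return the canonical concept for each synonym found, in text order."""
    lowered = text.lower()
    hits = []
    for variant, concept in SYNONYMS.items():
        for match in re.finditer(r"\b" + re.escape(variant) + r"\b", lowered):
            hits.append((match.start(), concept))
    return [concept for _, concept in sorted(hits)]


PATTERN = build_pattern("<gene> interacts with <gene>", GENES)

sentence = "BRCA1 interacts with RAD51 in carcinoma of the breast."
pairs = extract_interactions(sentence, PATTERN)
concepts = normalize_mentions(sentence)

# A paraphrase defeats the hardcoded pattern, which is the brittleness
# that variability in wording causes for simple rule-based systems.
missed = extract_interactions("RAD51 binds TP53.", PATTERN)
```

On the first sentence the pattern yields one gene pair and the synonym table maps "carcinoma of the breast" to a single canonical disease name, while the paraphrased sentence yields nothing. Because such a check is automatic, it can be rerun after every change to the rules, unlike post-hoc judging of outputs.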

Journal: PLoS Computational Biology
Publication Date: Jan 1, 2008
Citations: 203
License type: CC BY 4.0
