Biomolecular Relationships Discovered from Biological Labyrinth and Lost in Ocean of Literature: Community Efforts Can Rescue Until Automated Artificial Intelligence Takes Over

Rajinder Gupta, Shrikant S Mantri

doi:10.3389/fgene.2016.00046

Abstract

Many brilliant minds are at work deciphering the biological labyrinth, and as a result an immense amount of information about biological entities and their relationships is accumulating in the published literature (Hunter and Cohen, 2006). To cater to the needs of researchers, many tools have been designed to perform Named Entity Recognition (NER), Information Retrieval (IR), and Information Extraction (IE), viz. A Combined Clinical Concept Annotator (Kang et al., 2012), BANNER (Leaman and Gonzalez, 2008), Biblio-MetReS (Usie et al., 2014), BioTextQuest+ (Papanikolaou et al., 2014), BIOSMILE Web Search (Dai et al., 2008), E3Miner (Lee et al., 2008), EBIMed (Rebholz-Schuhmann et al., 2007), eFIP (Arighi et al., 2011), FACTA+ (Tsuruoka et al., 2008), GNSuite, iHOP (Hoffmann and Valencia, 2004), MyMiner (Salgado et al., 2012), RLIMS-P (Hu et al., 2005), Anni (Jelier et al., 2008), CoPub (Frijters et al., 2008), MedScan (Novichkova et al., 2003), PPInterFinder (Raja et al., 2012), pGenN (Ding et al., 2015), SciMiner (Hur et al., 2009), BIGNER (Li et al., 2009), and hybrid named entity tagger (Raja et al., 2014). More such tools can be obtained from the BIONLP resource, and detailed analyses of many NLP tools are given by Krallinger et al. (2008) and Fleuren and Alkema (2015). Table 1 gives informational and statistical insight into some of these literature mining tools, shedding light on their efficiency as expressed by the statistical parameters F-score, recall, and precision. Many tools are domain specific (for example, specific to the kinase family) yet still call for human intervention to achieve exactitude, which limits their usage. Moreover, the output formats are sometimes too rudimentary, such as simple name highlighting, to be put to use in larger literature searches.

Table 1. Informational (viz. data used, parameters for evaluation, and working platform) and statistical (viz. F-score, recall, and precision) insights for a few literature mining tools, with their brief description and links to the tools' home pages.

Naming ambiguity in the scientific literature is one of the major concerns for NER, as is sentence structure for IR and IE. Presently, NER tools need to maintain a comprehensive dictionary of all names, aliases, and web-repository-specific IDs, or have their artificial intelligence (AI) algorithms trained on many data sets. Many such dictionaries are available, but the list is ever-increasing, and so is the training data set. This results in investing more money, time, and effort in obtaining a comprehensive list of names, aliases, and IDs. A very comprehensive body of work on NLP can be found on BioNLP. Manpower and intellect are abundant, but funds are acutely scarce (Bourne et al., 2015), so we have to devise optimized approaches to address the issues discussed in the subsequent sections.
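As a rough illustration of the dictionary-based NER described above and of the evaluation parameters named for Table 1, the following Python sketch tags alias matches in a sentence and scores them against a hand-made gold standard. It is a minimal sketch, not the method of any tool listed here: the identifiers (`EX:0001` and so on), the alias table, the example sentence, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy dictionary: made-up canonical ID -> known names and aliases.
ALIASES = {
    "EX:0001": ["TP53", "p53"],
    "EX:0002": ["EGFR", "ERBB1", "HER1"],
}


def build_lookup(aliases):
    """Map each lower-cased name or alias to its canonical ID."""
    return {name.lower(): cid for cid, names in aliases.items() for name in names}


def tag_entities(text, lookup):
    """Return (start, end, canonical_id) for every token found in the dictionary."""
    hits = []
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        cid = lookup.get(match.group().lower())
        if cid is not None:
            hits.append((match.start(), match.end(), cid))
    return hits


def precision_recall_f1(predicted, gold):
    """Score predicted annotations against gold annotations by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Illustrative sentence; MDM2 is deliberately absent from the toy dictionary.
text = "Levels of p53, HER1 and MDM2 were measured."
hits = tag_entities(text, build_lookup(ALIASES))

# Hand-made gold standard for this sentence (hypothetical annotations).
gold = [(10, 13, "EX:0001"), (15, 19, "EX:0002"), (24, 28, "EX:0003")]
precision, recall, f1 = precision_recall_f1(hits, gold)
print(hits)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```

The missed MDM2 mention lowers recall while precision stays perfect, which mirrors the point made in the text: a dictionary-based tagger is only as complete as its list of names and aliases, and that list has to keep growing.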

Highlights

Many brilliant minds are at work deciphering the biological labyrinth, and as a result an immense amount of information about biological entities and their relationships is accumulating in the published literature (Hunter and Cohen, 2006).
Named Entity Recognition (NER) tools need to maintain a comprehensive dictionary of all names, aliases, and web-repository-specific IDs, or have their artificial intelligence (AI) algorithms trained on many data sets.

Summary

Frontiers in Genetics

Named Entity Recognition (NER) tools need to maintain a comprehensive dictionary of all names, aliases, and web-repository-specific IDs, or have their artificial intelligence (AI) algorithms trained on many data sets. Many such dictionaries are available, but the list is ever-increasing, and so is the training data set. This results in investing more money, time, and effort in obtaining a comprehensive list of names, aliases, and IDs. A very comprehensive body of work on NLP can be found on BioNLP. Manpower and intellect are abundant, but funds are acutely scarce (Bourne et al., 2015), so we have to devise optimized approaches to address the issues discussed in the subsequent sections.

ISSUES IN LITERATURE TEXT MINING

MORE DATA LESS INFORMATION

CURRENT PROGRESS

THE WAYS TO PASS THE IMPASSABLE

Concept annotation system for clinical records

Journal: Frontiers in Genetics	Publication Date: Mar 31, 2016
License type: cc-by
