Development Of Natural Language Processing Research Articles

Background: The ShARe/CLEF eHealth challenge lab aims to stimulate development of natural language processing and information retrieval technologies to aid patients in understanding their clinical reports. In clinical text, acronyms and abbreviations, also referred to as short forms, can be difficult for patients to understand. For one of three shared tasks in 2013 (Task 2), we generated a reference standard of clinical short forms normalized to the Unified Medical Language System. This reference standard can be used to improve patient understanding by linking to web sources with lay descriptions of annotated short forms or by substituting short forms with a simpler lay term.

Methods: In this study, we evaluate 1) the accuracy of participating systems in normalizing short forms compared to a majority sense baseline approach, 2) the performance of participants' systems for short forms with variable majority sense distributions, and 3) the accuracy of participating systems in normalizing concepts shared between the test set and the Consumer Health Vocabulary, a vocabulary of lay medical terms.

Results: The best systems submitted by the five participating teams performed with accuracies ranging from 43 to 72%. A majority sense baseline approach achieved the second best performance. The accuracy of participating systems for normalizing short forms with two or more senses ranged from 52 to 78% with low ambiguity (majority sense greater than 80%), from 23 to 57% with moderate ambiguity (majority sense between 50 and 80%), and from 2 to 45% with high ambiguity (majority sense less than 50%). With respect to the ShARe test set, 69% of short form annotations contained common concept unique identifiers with the Consumer Health Vocabulary. For these 2594 possible annotations, the performance of participating systems ranged from 50 to 75% accuracy.

Conclusion: Short form normalization continues to be a challenging problem. Short form normalization systems perform with moderate to reasonable accuracies. The Consumer Health Vocabulary could enrich its knowledge base with missed concept unique identifiers from the ShARe test set to further support patient understanding of unfamiliar medical terms.
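The abstract does not say how the majority sense baseline or the ambiguity groups were computed. The following is a minimal sketch under stated assumptions: the baseline always predicts the concept most often annotated for a short form in training data, and ambiguity is the share of that majority sense, bucketed with the thresholds quoted in the results. All function names, the toy annotations, and the placeholder concept IDs are hypothetical and are not taken from the challenge data.

```python
from collections import Counter, defaultdict


def count_senses(annotations):
    """Count how often each concept ID is annotated for each short form."""
    counts = defaultdict(Counter)
    for short_form, concept_id in annotations:
        counts[short_form.lower()][concept_id] += 1
    return counts


def majority_sense_model(counts):
    """Map every short form to its most frequent concept ID."""
    return {sf: senses.most_common(1)[0][0] for sf, senses in counts.items()}


def ambiguity_bucket(senses):
    """Bucket a short form by the share of its majority sense.

    Thresholds follow the abstract: above 80% is low ambiguity,
    50 to 80% is moderate, below 50% is high.
    """
    if len(senses) < 2:
        return "single sense"
    share = max(senses.values()) / sum(senses.values())
    if share > 0.8:
        return "low"
    if share >= 0.5:
        return "moderate"
    return "high"


def accuracy(model, gold):
    """Fraction of gold annotations whose concept ID the model predicts exactly."""
    if not gold:
        return 0.0
    hits = sum(1 for sf, cid in gold if model.get(sf.lower()) == cid)
    return hits / len(gold)


# Toy data with placeholder concept IDs (assumed for illustration, not real UMLS CUIs).
train = (
    [("pt", "CUI-PATIENT")] * 9 + [("pt", "CUI-PHYSIO")]
    + [("ra", "CUI-A")] * 3 + [("ra", "CUI-B")] * 2
    + [("ms", "CUI-X")] * 2 + [("ms", "CUI-Y"), ("ms", "CUI-Z"), ("ms", "CUI-W")]
)
test = [("PT", "CUI-PATIENT"), ("ra", "CUI-B"), ("ms", "CUI-X"), ("bp", "CUI-Q")]

counts = count_senses(train)
model = majority_sense_model(counts)
print({sf: ambiguity_bucket(senses) for sf, senses in counts.items()})
print(f"baseline accuracy: {accuracy(model, test):.2f}")
```

In this sketch a short form never seen in training ("bp" above) simply counts as an error. A real system would need a fallback for unseen short forms, which the abstract does not describe.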

Purpose: The purpose of this paper is to provide a literature review of the principal formats and frameworks that have been used in the last 20 years to exchange linguistic resources. It aims to give special attention to the most recent approaches to publishing linguistic linked open data on the Web.

Design/methodology/approach: Research papers published since 1990 on the use of various formats, standards, frameworks and methods to exchange linguistic information were divided into two main categories: those proposing specific schemas and syntaxes to suit the requirements of a given type of linguistic data (these are referred to as offline approaches), and those adopting the linked data (LD) initiative and the semantic web technologies to support the interoperability of heterogeneous linguistic resources. For each paper, the type of linguistic resource exchanged, the framework/format used, the interoperability approach taken and the related projects were identified.

Findings: The information gathered in the survey reflects an increase in recent years in approaches adopting the LD initiative. This is due to the fact that the structural and syntactic issues which arise when addressing the interoperability of linguistic resources can be solved by applying semantic web technologies. What remains an open issue in the field of computational linguistics is the development of knowledge artefacts and mechanisms to support the alignment of the different aspects of linguistic resources in order to guarantee semantic and conceptual interoperability in the linked open data (LOD) cloud. Ontologies have proved to be of great use in achieving this goal.

Research limitations/implications: The research presented here is by no means a comprehensive or all-inclusive survey of all existing approaches to the exchange of linguistic resources. Rather, the aim was to highlight, analyze and categorize the most significant advances in the field.

Practical implications: This survey has practical implications for computational linguists and for every application requiring new developments in natural language processing. In addition, multilingual issues can be better addressed when semantic interoperability of heterogeneous linguistic resources is achieved.

Originality/value: The paper provides a survey of past and present research and developments addressing the interoperability of linguistic resources, including those where the linked data initiative has been adopted.

Articles published on Development Of Natural Language Processing

Novel Linguistic Steganography Based on Character-Level Text Generation

The Translator’s Extended Mind

Relating Mori's Uncanny Valley in generating conversations with artificial affective communication and natural language processing.

Generation and evaluation of artificial mental health records for Natural Language Processing

A method of computing conceptual semantic similarity based on part-whole relationship

TS-CSW: text steganalysis and hidden capacity estimation based on convolutional sliding windows

An Intelligent Chatbot System Based on Entity Extraction Using RASA NLU and Neural Network

A Natural Language Processing Pipeline of Chinese Free-Text Radiology Reports for Liver Cancer Diagnosis

Research on Mongolian-Chinese machine translation based on the end-to-end neural network

Language theoretic properties of regular DAG languages

Context Free Grammar For Turkish

Automated Sentence Boundary Detection in Modern Standard Arabic Transcripts using Deep Neural Networks

Normalizing acronyms and abbreviations to aid patient understanding of clinical texts: ShARe/CLEF eHealth Challenge 2013, Task 2.

Facilitate Knowledge Exploration with Storytelling

An Extension of Standard Latent Dirichlet Allocation to Multiple Corpora

Tools to accurately identify veterans who undergo molecular diagnostic testing.

A survey on the exchange of linguistic resources

NLP@Desktop

Language and the Boundaries of Research: Media Monitoring Technologies in International Media Research

Use of NLP Tools in CALL System for Arabic
