Semantic Similarity Techniques Research Articles

Background: Semantic textual similarity is a common task in the general English domain to assess the degree to which the underlying semantics of 2 text segments are equivalent to each other. Clinical Semantic Textual Similarity (ClinicalSTS) is the semantic textual similarity task in the clinical domain that attempts to measure the degree of semantic equivalence between 2 snippets of clinical text. Due to the frequent use of templates in the Electronic Health Record system, a large amount of redundant text exists in clinical notes, making ClinicalSTS crucial for the secondary use of clinical text in downstream clinical natural language processing applications, such as clinical text summarization, clinical semantics extraction, and clinical information retrieval.

Objective: Our objective was to release ClinicalSTS data sets and to motivate natural language processing and biomedical informatics communities to tackle semantic text similarity tasks in the clinical domain.

Methods: We organized the first BioCreative/OHNLP ClinicalSTS shared task in 2018 by making available a real-world ClinicalSTS data set. We continued the shared task in 2019 in collaboration with National NLP Clinical Challenges (n2c2) and the Open Health Natural Language Processing (OHNLP) consortium and organized the 2019 n2c2/OHNLP ClinicalSTS track. We released a larger ClinicalSTS data set comprising 1642 clinical sentence pairs, including 1068 pairs from the 2018 shared task and 1006 new pairs from 2 electronic health record systems, GE and Epic. We released 80% (1642/2054) of the data to participating teams to develop and fine-tune the semantic textual similarity systems and used the remaining 20% (412/2054) as blind testing to evaluate their systems. The workshop was held in conjunction with the American Medical Informatics Association 2019 Annual Symposium.

Results: Of the 78 international teams that signed on to the n2c2/OHNLP ClinicalSTS shared task, 33 produced a total of 87 valid system submissions. The top 3 systems were generated by IBM Research, the National Center for Biotechnology Information, and the University of Florida, with Pearson correlations of r=.9010, r=.8967, and r=.8864, respectively. Most top-performing systems used state-of-the-art neural language models, such as BERT and XLNet, and state-of-the-art training schemas in deep learning, such as pretraining and fine-tuning schema, and multitask learning. Overall, the participating systems performed better on the Epic sentence pairs than on the GE sentence pairs, despite a much larger portion of the training data being GE sentence pairs.

Conclusions: The 2019 n2c2/OHNLP ClinicalSTS shared task focused on computing semantic similarity for clinical text sentences generated from clinical notes in the real world. It attracted a large number of international teams. The ClinicalSTS shared task could continue to serve as a venue for researchers in natural language processing and medical informatics communities to develop and improve semantic textual similarity techniques for clinical text.

Articles published on Semantic Similarity Techniques

Intelligence model on sequence-based prediction of PPI using AISSO deep concept with hyperparameter tuning process

A novel self-supervised sentiment classification approach using semantic labeling based on contextual embeddings

A novel optimized deep learning method for protein-protein prediction in bioinformatics

Summarization of Software Bug Report based on Sentence Semantic Similarity (SSBRSSS) Technique

Finding Patient Zero and Tracking Narrative Changes in the Context of Online Disinformation Using Semantic Similarity Analysis

MAYA: Exploring multiform attributes of node to align YANG data models

Extractive Multi-document Text Summarization Leveraging Hybrid Semantic Similarity Measures

A computational framework for modeling functional protein-protein interactions.

A state-of-the-art survey on semantic similarity for document clustering using GloVe and density-based algorithms

AraCap: A hybrid deep learning architecture for Arabic Image Captioning

The 2019 n2c2/OHNLP Track on Clinical Semantic Textual Similarity: Overview.

Integrating COBIT 5 PAM and TIPA for ITIL Using an Ontology Matching System

A Semantic Similarity Evaluation for Healthcare Ontologies Matching to HL7 FHIR Resources.

Classification of Semantic Similarity Technique between Word Pairs using Word Net

Spam e-mail classification for the Internet of Things environment using semantic similarity approach

Unsupervised and supervised text similarity systems for automated identification of national implementing measures of European directives

Improving Polarity Classification for Financial News Using Semantic Similarity Techniques

A k-means based co-clustering (kCC) algorithm for sparse, high dimensional data

Role Term-Based Semantic Similarity Technique for Idea Plagiarism Detection

A novel framework for social web forums’ thread ranking based on semantics and post quality features
