Abstract

Most previous work on relation extraction between named entities is limited to extracting pre-defined relation types, which is inefficient for massive unlabeled text data. Recently, with the appearance of various distributional word representations, unsupervised methods for many natural language processing (NLP) tasks have been widely researched. In this paper, we focus on a new task in unsupervised relation extraction, which we call distributional relation representation. Without requiring pre-defined types, distributional relation representation aims to automatically learn entity vectors and then estimate the semantic similarity between entities. We choose global vectors (GloVe) as the base model for training entity vectors because of its excellent balance between local context and global statistics over the whole corpus. To train the model more efficiently, we improve the traditional GloVe model by approximating entity co-occurrences with the cosine similarity between entity vectors instead of the dot product. Because cosine similarity normalizes vectors to unit length, it is intuitively more reasonable and converges more easily to a local optimum. We call the improved model RGloVe. Experimental results on a massive corpus of Sina News show that our proposed model outperforms traditional global vectors. Finally, the graph database Neo4j is introduced to store the extracted relationships between named entities. The most competitive advantage of Neo4j is that it provides a highly accessible way to query both direct and indirect relationships between entities.
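The kind of direct and indirect relationship queries mentioned above can be illustrated with hedged Cypher queries. The paper does not specify its graph schema, so the `Entity` label, the `RELATED_TO` relationship type, and the property names here are assumptions for illustration only.

```cypher
// Hypothetical schema: (:Entity {name})-[:RELATED_TO {label}]->(:Entity)

// Direct relationships of one entity
MATCH (a:Entity {name: $src})-[r:RELATED_TO]->(b:Entity)
RETURN b.name AS entity, r.label AS relation;

// Indirect relationships: variable-length paths of 2 to 3 hops
MATCH path = (a:Entity {name: $src})-[:RELATED_TO*2..3]->(b:Entity)
RETURN b.name AS entity, length(path) AS hops;
```

The variable-length pattern `*2..3` is what makes indirect relationships easy to reach in Neo4j; the equivalent multi-way join in a relational store would be considerably more awkward.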

Highlights

  • With the explosive growth and easy accessibility of web documents, extracting useful nuggets from irrelevant and redundant messages becomes a cognitively demanding and time-consuming task

  • For RQ1, this paper presents an improved global vectors model called RGloVe, based on the idea of distributed representation

  • For the task of distributional relation representation, we propose an improved global vectors model called RGloVe, which trains word vectors more effectively
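The cosine-similarity modification highlighted above can be sketched as a small NumPy objective. This is an illustrative reading of the RGloVe idea, not the authors' implementation: the weighting function follows standard GloVe, and all array names and hyperparameters here are assumptions.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Standard GloVe weighting function f(x) for co-occurrence counts."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def cosine(u, v, eps=1e-8):
    """Cosine similarity; normalizing to unit vectors is the RGloVe change."""
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)

def rglove_loss(W, W_tilde, b, b_tilde, X):
    """Weighted least-squares objective in which cosine similarity
    replaces the dot product of standard GloVe.

    W, W_tilde : entity and context vectors, shape (n, d)
    b, b_tilde : entity and context biases, shape (n,)
    X          : co-occurrence matrix, shape (n, n)
    """
    total = 0.0
    for i, j in zip(*np.nonzero(X)):
        pred = cosine(W[i], W_tilde[j]) + b[i] + b_tilde[j]
        total += glove_weight(X[i, j]) * (pred - np.log(X[i, j])) ** 2
    return total
```

Because the cosine term is bounded in [-1, 1], the model cannot fit `log X_ij` by inflating vector norms, which is one plausible reason the modified objective converges more easily.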



Introduction

With the explosive growth and easy accessibility of web documents, extracting useful nuggets from irrelevant and redundant messages becomes a cognitively demanding and time-consuming task. Under this circumstance, information extraction was proposed to extract structured data from text documents. The automatic content extraction (ACE) program [1] provides an annotated corpus and evaluation criteria for a series of information extraction tasks. Traditional relation extraction is often limited to extracting pre-defined types. ACE 2003 defines five relation types: AT (location relationships), NEAR (relative locations), PART (part-whole relationships), ROLE (the role a person plays in an organization) and SOCIAL (social relationships).

Algorithms 2017, 10, 42; doi:10.3390/a10020042 (www.mdpi.com/journal/algorithms)

