Term Vector Space Research Articles

High dimensionality is an important concern in short text classification because it affects both the computational cost and the accuracy of classifiers. Short text data, besides being high dimensional, also has an incomplete, inconsistent and sparse structure. Selecting the important features that provide a better representation is one solution to the high dimensionality problem. In this study, we developed a novel filter feature selection method, the Proportional Rough Feature Selector (PRFS), which uses rough sets to make a regional distinction, according to the value set of a term, between documents that exactly belong to a class and documents that possibly belong to a class. Documents that possibly belong to a class are penalized by multiplying with a coefficient named α. Additionally, the effect of sparsity in the term vector space is calculated with the help of the rough set. The PRFS is compared with state-of-the-art filter feature selection methods: Gini index, information gain, distinguishing feature selector, and the recently proposed max–min ratio and normalized difference measure methods. The comparison is carried out using various feature sizes on four different short text datasets, with Macro-F1 as the success measure. Experimental results demonstrated that the PRFS offers either better or competitive performance with respect to the other feature selection methods in terms of Macro-F1. This study may be a pioneering study in this research field, as it proposes a novel feature selection method for short text classification using rough set theory.
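The regional distinction described in the abstract can be illustrated with the standard rough-set lower and upper approximations. The sketch below is not the PRFS formula, which the abstract does not give. It assumes a binary term value (present or absent), and the names `approximations` and `illustrative_score`, the default `alpha=0.5`, and the toy documents are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def approximations(
    docs: List[Set[str]], labels: List[str], term: str, cls: str
) -> Tuple[Set[int], Set[int]]:
    """Rough-set lower and upper approximation of class `cls` for one term.

    Assumption: the term's value set is binary (present / absent), so the
    documents fall into at most two equivalence classes.
    """
    # Group document indices by the value the term takes in each document.
    blocks: Dict[bool, Set[int]] = {True: set(), False: set()}
    for i, doc in enumerate(docs):
        blocks[term in doc].add(i)

    target = {i for i, y in enumerate(labels) if y == cls}
    lower: Set[int] = set()  # documents that certainly belong to cls
    upper: Set[int] = set()  # documents that possibly belong to cls
    for block in blocks.values():
        if block and block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper


def illustrative_score(
    docs: List[Set[str]], labels: List[str], term: str, cls: str,
    alpha: float = 0.5,
) -> float:
    """Hypothetical score: certain members count fully, boundary members
    of the class are down-weighted by alpha. NOT the published PRFS score."""
    lower, upper = approximations(docs, labels, term, cls)
    target = {i for i, y in enumerate(labels) if y == cls}
    if not target:
        return 0.0
    boundary_members = (upper - lower) & target
    return (len(lower) + alpha * len(boundary_members)) / len(target)


# Toy corpus (hypothetical): each document is a set of terms.
docs = [{"goal", "match"}, {"goal"}, {"vote"}, {"vote", "match"}]
labels = ["sport", "sport", "politics", "politics"]

print(illustrative_score(docs, labels, "goal", "sport"))
print(illustrative_score(docs, labels, "match", "sport"))
```

In this toy corpus, "goal" separates the sport class exactly, so every sport document lies in the lower approximation, while "match" occurs in both classes, so the sport documents fall in the boundary region and are down-weighted by α.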


The increasing concern with misinformation has stimulated research efforts on automatic fact checking. The recently released FEVER dataset introduced a benchmark fact verification task in which a system is asked to verify a claim using evidential sentences from Wikipedia documents. In this paper, we present a connected system consisting of three homogeneous neural semantic matching models that conduct document retrieval, sentence selection, and claim verification jointly for fact extraction and verification. For evidence retrieval (document retrieval and sentence selection), unlike traditional vector space IR models in which queries and sources are matched in some pre-designed term vector space, we develop neural models to perform deep semantic matching from raw textual input, assuming no intermediate term representation and no access to structured external knowledge bases. We also show that Pageview frequency can help improve the performance of evidence retrieval results, which can later be matched using our neural semantic matching network. For claim verification, unlike previous approaches that simply feed upstream retrieved evidence and the claim to a natural language inference (NLI) model, we further enhance the NLI model by providing it with internal semantic relatedness scores (hence integrating it with the evidence retrieval modules) and ontological WordNet features. Experiments on the FEVER dataset indicate that (1) our neural semantic matching method outperforms popular TF-IDF and encoder models by significant margins on all evidence retrieval metrics, (2) the additional relatedness score and WordNet features improve the NLI model via better semantic awareness, and (3) by formalizing all three subtasks as a similar semantic matching problem and improving on all three stages, the complete model is able to achieve state-of-the-art results on the FEVER test set (two times greater than baseline results).
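For contrast with the neural approach, the traditional term vector space matching that the abstract refers to can be sketched as TF-IDF weighting plus cosine similarity. This is a minimal illustration of that baseline family, not the system or the baseline used in the paper. The whitespace tokenizer, the smoothed IDF variant, the helper names and the example sentences are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization is enough for a sketch.
    return text.lower().split()


def tfidf_vectors(texts: List[str]) -> List[Dict[str, float]]:
    """Map each text to a sparse TF-IDF vector over the shared vocabulary."""
    tokenized = [tokenize(t) for t in texts]
    n = len(tokenized)
    # Document frequency: number of texts in which each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF, so terms present in every text keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query: str, candidates: List[str]) -> List[int]:
    """Return candidate indices ordered from most to least similar."""
    vecs = tfidf_vectors([query] + candidates)
    scores = [cosine(vecs[0], c) for c in vecs[1:]]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Hypothetical claim and candidate evidence sentences.
claim = "paris is the capital of france"
sentences = [
    "the capital of france is paris",
    "berlin is the capital of germany",
    "bananas are yellow",
]
print(rank(claim, sentences))
```

Because matching happens only on shared surface terms, a paraphrase with no word overlap would score zero here, which is the limitation that semantic matching from raw text is meant to address.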




Articles published on Term Vector Space

Enumeration of anti-invariant subspaces and Touchard's formula for the entries of the q-Hermite Catalan matrix

On The Fuzzy Weak Complex Vector Spaces

Infinite-dimensional linear algebra and solvability of partial differential equations

On a Characterization of Finite-Dimensional Vector Spaces

Predictive article recommendation using natural language processing and machine learning to support evidence updates in domain-specific knowledge graphs

Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach

A novel filter feature selection method using rough set for short text data

Induced L-bornological vector spaces and L-Mackey convergence

Combining Fact Extraction and Verification with Neural Semantic Matching Networks

Twisted Hilbert spaces of 3d supersymmetric gauge theories

Salience Estimation via Variational Auto-Encoders for Multi-Document Summarization

The inner product on exterior powers of a complex vector space

Rapid polynomial approximation in $\boldsymbol{L_2}$-spaces with Freud weights on the real line

Efficient PET-CT image retrieval using graphs embedded into a vector space.

A characterisation of inner product spaces by the maximal circumradius of spheres

Evaluation of Co-occurring Terms in Clinical Documents Using Latent Semantic Indexing

Ring of normal cones

Monomial bases for broken circuit complexes

Spectral decomposition of real symmetric quadratic $\lambda $-matrices and its applications

Variational form of the large deviation functional

