Traditional Vector Space Research Articles

The increasing concern with misinformation has stimulated research efforts on automatic fact checking. The recently released FEVER dataset introduced a benchmark fact-verification task in which a system is asked to verify a claim using evidential sentences from Wikipedia documents. In this paper, we present a connected system consisting of three homogeneous neural semantic matching models that conduct document retrieval, sentence selection, and claim verification jointly for fact extraction and verification. For evidence retrieval (document retrieval and sentence selection), unlike traditional vector space IR models in which queries and sources are matched in some pre-designed term vector space, we develop neural models to perform deep semantic matching from raw textual input, assuming no intermediate term representation and no access to structured external knowledge bases. We also show that Pageview frequency can help improve evidence retrieval, whose results can later be matched using our neural semantic matching network. For claim verification, unlike previous approaches that simply feed upstream retrieved evidence and the claim to a natural language inference (NLI) model, we further enhance the NLI model by providing it with internal semantic relatedness scores (hence integrating it with the evidence retrieval modules) and ontological WordNet features. Experiments on the FEVER dataset indicate that (1) our neural semantic matching method outperforms popular TF-IDF and encoder models by significant margins on all evidence retrieval metrics, (2) the additional relatedness score and WordNet features improve the NLI model via better semantic awareness, and (3) by formalizing all three subtasks as a similar semantic matching problem and improving on all three stages, the complete model is able to achieve state-of-the-art results on the FEVER test set (two times greater than baseline results).
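For readers unfamiliar with the "traditional vector space IR" baseline that the abstract contrasts against, the following is a minimal sketch of TF-IDF matching with cosine similarity. It is not the paper's system or its baseline implementation: the whitespace tokenizer, the smoothed IDF formula, and all function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `rank`) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def build_idf(docs):
    # Document frequency: number of documents containing each term.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    # Assumption: a smoothed IDF variant; other weightings are equally valid.
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(text, idf):
    # Sparse term vector; terms unseen in the collection are dropped.
    tf = Counter(tokenize(text))
    return {t: f * idf[t] for t, f in tf.items() if t in idf}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    # Returns (score, document index) pairs, best match first.
    idf = build_idf(docs)
    q = tfidf_vector(query, idf)
    scores = [(cosine(q, tfidf_vector(d, idf)), i) for i, d in enumerate(docs)]
    return sorted(scores, reverse=True)


docs = [
    "the eiffel tower is in paris",
    "the colosseum is in rome",
    "bananas are yellow",
]
ranking = rank("where is the eiffel tower", docs)
```

Because matching happens only on shared surface terms, a document that paraphrases the query with different words scores zero here, which is the limitation that semantic matching models aim to address.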

Evolutionary Algorithms (EA) have developed rapidly as a powerful and general learning approach that has been used successfully to find reasonable solutions in data mining and knowledge discovery. The Genetic Algorithm (GA) is a mainstream EA paradigm for developing solutions to optimization problems. Clustering ensembles have emerged as a prominent machine learning technique that leverages the consensus across multiple clustering solutions and combines their predictions into a single solution with improved robustness, stability and accuracy. Multimedia advancement and the popularity of the social Web have together made it easy to generate videos in bulk, and categorization of such Web videos has become an active research challenge. In this paper, we propose a Semi-supervised Evolutionary Ensemble (SS-EE) framework for social media mining, e.g., Web Video Categorization (WVC), using low-cost textual features, intrinsic relations and extrinsic Web support. The contributions of this research work are as follows. First, we extend the traditional Vector Space Model (VSM) to a Semantic VSM (S-VSM) by considering the semantic similarity between feature terms using the Normalized Google Distance (NGD) approach. Second, we define a new distance measure, Triangular Similarity (TrS), between two Textual Feature Vectors (TFV) based on the frequencies of the most relevant terms in each category. Third, we iterate the clustering ensemble process with the help of a GA guided by a new measure, Pre-Paired Percentage (PPP), which is used as the fitness function during the genetic cycle. Fourth, we define the key GA steps, the crossover and mutation genetic operators, through an intelligent clustering ensemble mechanism. Fifth, in order to terminate the genetic cycle, we define another new measure, Clustering Quality (Cq), based on the similarity matrix and clustering labels. Experiments on real-world social-Web data (YouTube) have been performed to validate the SS-EE framework.
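The Normalized Google Distance mentioned in the first contribution has a standard closed form based on search hit counts. The sketch below computes it under stated assumptions: the function name `ngd`, its parameter names, and the example counts are hypothetical, and how the SS-EE framework feeds NGD into the S-VSM is not specified in the abstract and is not shown here.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance between two terms.

    fx, fy : hit counts for each term alone
    fxy    : hit count for both terms together
    n      : total number of indexed pages (assumed larger than fx and fy)
    """
    # Terms that never occur, or never co-occur, are treated as infinitely far apart.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")
    log_x, log_y, log_xy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_x, log_y) - log_xy) / (math.log(n) - min(log_x, log_y))


# Hypothetical counts, for illustration only.
same = ngd(1000, 1000, 1000, 10**6)   # terms that always co-occur
close = ngd(1000, 2000, 500, 10**6)   # terms that often co-occur
never = ngd(1000, 1000, 0, 10**6)     # terms that never co-occur
```

Smaller values indicate more closely related terms; a value of 0 means the two terms always appear together.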

Articles published on Traditional Vector Space

Automatic classification of document resources based on Naive Bayesian classification algorithm

Text Clustering and Economic Analysis of Free Trade Zone Governance Strategies Based on Random Matrix and Subject Analysis

Text Feature Extraction for Public English Vocabulary Based on Wavelet Transform

Intelligent Analysis and Positioning of Political Public Opinion in Universities

Biomedical Document Clustering Based on Accelerated Symbiotic Organisms Search Algorithm

EWNStream+: Effective and Real-time Clustering of Short Text Streams Using Evolutionary Word Relation Network

Combining Fact Extraction and Verification with Neural Semantic Matching Networks

Application of improved distributed naive Bayesian algorithms in text classification

Suitability and importance of deep learning feature space in the domain of text categorisation

Practical Text Phylogeny for Real-World Settings

Predicting users’ demographic characteristics in a Chinese social media network

Kernel-based consensus clustering for ontology-embedded document repository of power substations

Semantic association ranking schemes for information retrieval applications using term association graph representation

A Semantic Aspect-Based Vector Space Model to Identify the Event Evolution Relationship within Topics

Semi-supervised evolutionary ensembles for Web video categorization

Information Ordering with an Event‐Enriched Vector Space Model for Multi‐Document News Summarization

A Novel Method for Text Similarity Calculation

On Multi Class Vector Space Model-based Information Retrieval

LDA boost classification: boosting by topics
