Cross-lingual Setting Research Articles

Semantic representation is a way of expressing the meaning of a text that can be processed by a machine to serve a particular natural language processing (NLP) task that usually requires meaning comprehension such as text summarisation, question answering or machine translation. In this paper, we present a semantic parsing model based on neural networks to obtain semantic representation of a given sentence. We utilise semantic representation of each sentence to generate semantically informed sentence embeddings for extrinsic evaluation of the proposed semantic parser, in particular for the semantic textual similarity task. Our neural parser utilises self-attention mechanism to learn semantic relations between words in a sentence to generate semantic representation of a sentence in UCCA (Universal Conceptual Cognitive Annotation) semantic annotation framework (Abend and Rappoport, 2013), which is a cross-linguistically applicable graph-based semantic representation. The UCCA representations are conveyed into a Siamese Neural Network built on top of two Recursive Neural Networks (Siamese-RvNN) to derive semantically informed sentence embeddings which are evaluated on semantic textual similarity task. We conduct both single-lingual and cross-lingual experiments with zero-shot and few-shot learning, which have shown superior performance even in low-resource scenario. The experimental results show that the proposed self-attentive neural parser outperforms the other parsers in the literature on English and German, and shows significant improvement in the cross-lingual setting for French which has comparatively low sources. Moreover, the results obtained from other downstream tasks such as sentiment analysis confirm that semantically informed sentence embeddings provide higher-quality embeddings compared to other pre-trained models such as SBERT (Reimers et al., 2019) or SimCSE (Gao et al., 2021), which do not utilise such structured information.

Read full abstract

The rapid growth in the digital era initiates the need to inculcate and preserve the academic originality of translated texts. Cross-lingual semantic similarity is concerned with identifying the degree of similarity of textual pairs written in two different languages and determining whether they are plagiarized. Unlike existing approaches, which exploit lexical and syntax features for mono-lingual similarity, this work proposed rich semantic features extracted from cross-language textual pairs, including topic similarity, semantic role labeling, spatial role labeling, named entities recognition, bag-of-stop words, bag-of-meanings for all terms, n-most frequent terms, n-least frequent terms, and different sets of their combinations. Knowledge-based semantic networks such as BabelNet and WordNet were used for computing semantic relatedness across different languages. This paper attempts to investigate two tasks, namely, cross-lingual semantic text similarity (CL-STS) and plagiarism detection and judgement (PD) using deep neural networks, which, to the best of our knowledge, have not been implemented before for STS and PD in cross-lingual setting, and using such combination of features. For this purpose, we proposed different neural network architectures to solve the PD task as either binary classification (plagiarism/independently written), or even deeper classification (literally translated/paraphrased/summarized/independently written). Deep neural networks were also used as regressors to predict semantic connotations for CL-STS tasks. Experimental results were performed on a large number of handmade data taken from multiple sources consisting of 71,910 Arabic-English pairs. Overall, experimental results showed that using deep neural networks with rich semantic features achieves encouraging results in comparison to the baselines. The proposed classifiers and regressors tend to show comparable performances when using different architectures of neural networks, but both the binary and multi-class classifiers outperform the regressors. Finally, the evaluation and analysis of using different sets of features reflected the supremacy of deeper semantic features on the classification results.

Read full abstract

Cross-lingual Setting Research Articles

Related Topics

Articles published on Cross-lingual Setting

Evaluating Neural Networks’ Ability to Generalize against Adversarial Attacks in Cross-Lingual Settings

Enhancing Cross-Lingual Sarcasm Detection by a Prompt Learning Framework with Data Augmentation and Contrastive Learning

A syntax-enhanced parameter generation network for multi-source cross-lingual event extraction

Cross-lingual few-shot sign language recognition

Visual-linguistic-stylistic Triple Reward for Cross-lingual Image Captioning

Multilingual Coreference Resolution in Multiparty Dialogue

Monolingual, multilingual and cross-lingual code comment classification

Cross-Lingual and Cross-Domain Crisis Classification for Low-Resource Scenarios

A Siamese Neural Network for Learning Semantically-Informed Sentence Embeddings

Language Model Priming for Cross-Lingual Event Extraction

Using Semi-Supervised Learning and Wikipedia to Train an Event Argument Extraction System

Event-Argument Linking in Disaster Domain

Unsupervised cross-lingual model transfer for named entity recognition with contextualized word representations.

Towards multidomain and multilingual abusive language detection: a survey

Cross-Lingual Voice Conversion With Controllable Speaker Individuality Using Variational Autoencoder and Star Generative Adversarial Network

Semi-supervised disentangled framework for transferable named entity recognition

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Crosslingual Lexical Semantic Similarity

Evaluating cross-lingual textual similarity on dictionary alignment problem

Identifying cross-lingual plagiarism using rich semantic features and deep neural networks: A study on Arabic-English plagiarism cases

Cross-Lingual Natural Language Generation via Pre-Training

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Cross-lingual Setting Research Articles

Related Topics

Articles published on Cross-lingual Setting

Evaluating Neural Networks’ Ability to Generalize against Adversarial Attacks in Cross-Lingual Settings

Enhancing Cross-Lingual Sarcasm Detection by a Prompt Learning Framework with Data Augmentation and Contrastive Learning

A syntax-enhanced parameter generation network for multi-source cross-lingual event extraction

Cross-lingual few-shot sign language recognition

Visual-linguistic-stylistic Triple Reward for Cross-lingual Image Captioning

Multilingual Coreference Resolution in Multiparty Dialogue

Monolingual, multilingual and cross-lingual code comment classification

Cross-Lingual and Cross-Domain Crisis Classification for Low-Resource Scenarios

A Siamese Neural Network for Learning Semantically-Informed Sentence Embeddings

Language Model Priming for Cross-Lingual Event Extraction

Using Semi-Supervised Learning and Wikipedia to Train an Event Argument Extraction System

Event-Argument Linking in Disaster Domain

Unsupervised cross-lingual model transfer for named entity recognition with contextualized word representations.

Towards multidomain and multilingual abusive language detection: a survey

Cross-Lingual Voice Conversion With Controllable Speaker Individuality Using Variational Autoencoder and Star Generative Adversarial Network

Semi-supervised disentangled framework for transferable named entity recognition

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Crosslingual Lexical Semantic Similarity

Evaluating cross-lingual textual similarity on dictionary alignment problem

Identifying cross-lingual plagiarism using rich semantic features and deep neural networks: A study on Arabic-English plagiarism cases

Cross-Lingual Natural Language Generation via Pre-Training