Word Representation Learning Research Articles

As biomedical experts, researchers, public health agencies, and healthcare professionals make increasing use of information technologies, the volume of scientific literature, clinical notes, and other structured and unstructured text resources is growing rapidly, and these resources are stored in data sources such as PubMed. These massive text resources can be leveraged to extract valuable knowledge and insights using machine learning techniques. Neural network-based classification models, which take numeric vectors (also known as word representations) of the training data as input, have recently gained popularity. The better the input vectors, the more accurate the classification. Word representations are learned as the distribution of words in an embedding space, in which each word has its own vector and words that are semantically similar, based on their contexts, appear near each other. However, such distributional word representations cannot capture relational semantics between distant words. In the biomedical domain, relation mining is a well-studied problem that aims to extract relational words, which associate distant entities that generally represent the subject and object of a sentence. Our goal is to capture relational semantic information between distant words from a large corpus in order to learn enhanced word representations, and to employ the learned representations in natural language processing tasks such as text classification. In this article, we propose using biomedical relation triplets to learn word representations by incorporating relational semantic information into the distributional representation of words. In other words, the proposed approach aims to capture both the distributional and the relational contexts of words when learning their numeric vectors from a text corpus. We also propose an application of the learned word representations to text classification.
The proposed approach is evaluated on multiple benchmark datasets, and the efficacy of the learned word representations is tested on word similarity and concept categorization tasks. Our proposed approach performs better than the state-of-the-art GloVe model. Furthermore, we applied the learned word representations to classify biomedical texts using four neural network-based classification models, and the classification accuracy further confirms the effectiveness of the word representations learned by our approach.
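
The idea of combining distributional and relational contexts can be illustrated with a small sketch. This is not the model from the article, whose training objective is not given in the abstract; it is a toy count-based stand-in that assumes a fixed context window and a hypothetical `relation_weight` parameter. The function names, the example sentence, and the example triplet are all made up for illustration. The sketch shows only that a (subject, relation, object) triplet can link two words that never share a window.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_vectors(sentences, triplets, window=2, relation_weight=1.0):
    """Build sparse count vectors from window contexts plus relation triplets.

    Illustrative only: `window` and `relation_weight` are assumptions,
    not parameters taken from the article.
    """
    vectors = defaultdict(Counter)

    # Distributional context: count words that co-occur within a fixed window.
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1.0

    # Relational context: connect subject and object (and the relation word)
    # even when they are too far apart to fall inside one window.
    for subj, rel, obj in triplets:
        for a, b in ((subj, obj), (obj, subj), (subj, rel), (obj, rel)):
            vectors[a][b] += relation_weight
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical example: "aspirin" and "fever" are seven tokens apart.
sentences = [["aspirin", "is", "widely", "prescribed", "because",
              "it", "reliably", "reduces", "fever"]]
triplets = [("aspirin", "reduces", "fever")]

plain = build_vectors(sentences, [])
augmented = build_vectors(sentences, triplets)

print(cosine(plain["aspirin"], plain["fever"]))
print(cosine(augmented["aspirin"], augmented["fever"]))
```

With window contexts alone, the two words share no context and their similarity is zero; once the triplet is added, both vectors contain the relation word and the similarity becomes positive. A learned embedding model would replace the raw counts with trained dense vectors, but the source of the extra signal is the same.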

Background: Information about a new coronavirus emerged in 2019 and rapidly spread around the world, gaining significant public attention and attracting negative bias. The use of stigmatizing language for the purpose of blaming sparked a debate.

Objective: This study aims to identify social stigma and negative sentiment toward the blameworthy agents in social communities.

Methods: We used a tailored text-mining platform to identify data in their natural settings by retrieving and filtering online sources, and we constructed vocabularies and learned word representations using natural language processing for deductive analysis in line with the research theme. The data sources comprised ten news websites, eleven discussion forums, one social network, and two principal media-sharing networks in Taiwan. A synthesis of news and social networking analytics is presented for the period from December 30, 2019, to March 31, 2020.

Results: We collated over 1.07 million Chinese texts. Almost two-thirds of the texts on COVID-19 came from news services (n=683,887, 63.68%), followed by Facebook (n=297,823, 27.73%), discussion forums (n=62,119, 5.78%), and Instagram and YouTube (n=30,154, 2.81%). Our data showed that online news served as a hotbed for negativity and for driving emotional social posts. Online information regarding COVID-19 associated it with China, and with a specific city within China through references to the “Wuhan pneumonia,” potentially encouraging xenophobia. This problematic moniker was used with high frequency, despite the World Health Organization guideline to avoid biased perceptions and ethnic discrimination. Social stigma is disclosed through negatively valenced responses, which are associated with the most blamed targets.

Conclusions: Our sample is sufficiently representative of a community because it contains a broad range of mainstream online media. Stigmatizing language linked to the COVID-19 pandemic shows a lack of civic responsibility that encourages bias, hostility, and discrimination. Frequently used stigmatizing terms were deemed offensive, and they might have contributed to recent backlashes against China by directing blame and encouraging xenophobia. The implications, ranging from health risk communication to stigma mitigation and xenophobia concerns amid the COVID-19 outbreak, are emphasized. Understanding the nomenclature and biased terms employed in relation to the COVID-19 outbreak is paramount. We propose solidarity with communication professionals in combating the COVID-19 outbreak and the infodemic. Finding solutions to curb the spread of virus bias, stigma, and discrimination is imperative.

Articles published on Word Representation Learning

Optimizing word embeddings for small datasets: a case study on patient portal messages from breast cancer patients

Optimizing Word Embeddings for Patient Portal Message Datasets with a Small Number of Samples.

A study on the innovative model of foreign language teaching in universities using big data corpus

The Role of Preprocessing for Word Representation Learning in Affective Tasks

Biomedical Text Classification Using Augmented Word Representation Based on Distributional and Relational Contexts.

Word Representation Learning in Multimodal Pre-Trained Transformers: An Intrinsic Evaluation

A Sequential Graph Neural Network for Short Text Classification

Detecting Fake News Over Job Posts via Bi-Directional Long Short-Term Memory (BIDLSTM)

Freshman or Fresher? Quantifying the Geographic Variation of Language in Online Social Media

Modeling multi-prototype Chinese word representation learning for word similarity

Word Representation Learning Based on Bidirectional GRUs With Drop Loss for Sentiment Classification

Domain structure-based transfer learning for cross-domain word representation

Learning word representation by jointly using neighbor and syntactic contexts

Sememe knowledge computation: a review of recent advances in application and expansion of sememe knowledge bases

Deep analysis of word sense disambiguation via semi-supervised learning and neural word representations

GEPC: Global embeddings with PID control

Learning emotional word embeddings for sentiment analysis

A Position Weighted Information Based Word Embedding Model for Machine Translation

Communicative Blame in Online Communication of the COVID-19 Pandemic: Computational Approach of Stigmatizing Cues and Negative Sentiment Gauged With Automated Analytic Techniques.

“Deep lexicography” – Fad or Opportunity?
